THE BASICS


We won’t make this section of the text too long — all we really want to do here is to take 
a short memory-jogging excursion through little bits and pieces you should remember 
about sets and numbers. The material in this chapter will not be (directly) examined. 


0.1 Numbers


Before we do anything else, it is very important that we agree on the definitions and names 
of some important collections of numbers. 


• Natural numbers: these are the “whole numbers” 1, 2, 3, ... that we learn first at
about the same time as we learn the alphabet. We will denote this collection of
numbers by the symbol “ℕ”. The symbol ℕ is written in a type of bold-face font that
we call “black-board bold” (and is definitely not the same symbol as N). You should
become used to writing a few letters in this way since it is typically used to denote
collections of important numbers. Unfortunately there is often some confusion as to
whether or not zero should be included¹. In this text the natural numbers do not
include zero.


Notice that the set of natural numbers is closed under addition and multiplication. 
This means that if you take any two natural numbers and add them you get another 
natural number. Similarly if you take any two natural numbers and multiply them 
you get another natural number. However the set is not closed under subtraction or 
division; we need negative numbers and fractions to make collections of numbers 
closed under subtraction and division. 


1 This lack of agreement comes from some debate over how “natural” zero is: “how can nothing be
something?” It was certainly not used by the ancient Greeks who really first looked at proof and number.
If you are a mathematician then generally 0 is not a natural number. If you are a computer scientist
then 0 generally is.

Two important subsets of natural numbers are:

- Prime numbers: a natural number is prime when the only natural numbers
that divide it exactly are 1 and itself. Equivalently it cannot be written as the
product of two natural numbers neither of which is 1. Note that 1 is not a
prime number².


- Composite numbers: a natural number greater than 1 is a composite number when it is not
prime.


Hence the number 7 is prime, but 6 = 3 × 2 is composite.
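As a rough illustration of these two definitions, here is a minimal Python sketch that tests primality by trial division. The function names is_prime and is_composite are our own choices for this illustration, and we assume the usual convention that 1 is neither prime nor composite.

```python
def is_prime(n: int) -> bool:
    """Return True when the natural number n is prime."""
    if n < 2:
        return False  # 1 is not a prime number
    d = 2
    while d * d <= n:  # a factor larger than sqrt(n) pairs with a smaller one
        if n % d == 0:
            return False  # d divides n exactly, so n is not prime
        d += 1
    return True


def is_composite(n: int) -> bool:
    """Assumption: composite means greater than 1 and not prime."""
    return n > 1 and not is_prime(n)


print(is_prime(7), is_composite(6))
```

Trial division is slow for large numbers, but it follows the definition directly, which is the point here.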


• Integers: all positive and negative whole numbers together with the number zero. We
denote the collection of all integers by the symbol “ℤ”. Again, note that this is not
the same symbol as “Z”, and we must write it in the same black-board bold font.
The ℤ stands for the German Zahlen meaning numbers³. Note that ℤ is closed under
addition, subtraction and multiplication, but not division.


Two important subsets of integers are: 


- Even numbers: an integer is even if it is exactly divisible by 2, or equivalently
if it can be written as the product of 2 and another integer. This means that
−14, 6 and 0 are all even.

- Odd numbers: an integer is odd when it is not even. Equivalently it can be
written as 2k + 1 where k is another integer. Thus 11 = 2 × 5 + 1 and −7 =
2 × (−4) + 1 are both odd.
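The two descriptions of odd numbers can be checked against each other with a short Python sketch. The names is_even and odd_witness are hypothetical helpers introduced only for this illustration.

```python
def is_even(n: int) -> bool:
    """An integer is even when it is exactly divisible by 2."""
    return n % 2 == 0


def odd_witness(n: int) -> int:
    """Return the integer k with n == 2*k + 1 (n must be odd)."""
    k, remainder = divmod(n, 2)  # Python rounds the quotient down, so remainder is 0 or 1
    if remainder != 1:
        raise ValueError(f"{n} is even")
    return k


for n in (-14, 6, 0):
    print(n, "is even:", is_even(n))
for n in (11, -7):
    print(f"{n} = 2 x ({odd_witness(n)}) + 1")
```

Because the quotient is rounded down, the sketch recovers k = −4 for −7, matching the decomposition in the text.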


• Rational numbers: these are all numbers that can be written as the ratio of two
integers. That is, any rational number r can be written as p/q where p, q are integers
and q is not zero. We denote this collection by ℚ, standing for quoziente, which is
Italian for quotient or ratio. Now we finally have a set of numbers which is closed
under addition, subtraction, multiplication and division (of course you still need to
be careful not to divide by zero).


• Real numbers: generally we think of these as numbers that can be written
as decimal expansions, and we denote the collection by ℝ. It is beyond the scope of this text to
go into the details of how to give a precise definition of real numbers, and the notion
that a real number can be written as a decimal expansion will be sufficient.


2 If you let 1 be a prime number then you have to treat 1 × 2 × 3 and 2 × 3 as different factorisations of
the number 6. This causes headaches for mathematicians, so they don’t let 1 be prime.
3 Some schools (and even some provinces!!) may use “I” for integers, but this is extremely non-standard
and they really should use correct notation.

It took mathematicians quite a long time to realise that there were numbers that
could not be written as ratios of integers⁴. The first numbers that were shown to
be not-rational are square-roots of prime numbers, like √2. Other well known
examples are π and e. Usually the fact that some numbers cannot be represented as
ratios of integers is harmless because those numbers can be approximated by
rational numbers to any desired precision.


The reason that we can approximate real numbers in this way is the surprising fact 
that between any two real numbers, one can always find a rational number. So if we 
are interested in a particular real number we can always find a rational number that 
is extremely close. Mathematicians refer to this property by saying that ℚ is dense in
ℝ.
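One way to see why a rational number can always be found between two given numbers a < b is to pick a denominator n large enough that 1/n is smaller than the gap b − a, and then take the first multiple of 1/n that lies above a. The Python sketch below follows that idea. It is only an illustration: the name rational_between is ours, and floating point inputs stand in for real numbers (they are converted to exact fractions first).

```python
import math
from fractions import Fraction


def rational_between(a, b) -> Fraction:
    """Return a rational number q with a < q < b (requires a < b)."""
    a, b = Fraction(a), Fraction(b)  # exact values of the inputs
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1  # guarantees 1/n < b - a
    m = math.floor(a * n) + 1        # smallest integer with m/n > a
    return Fraction(m, n)


root2 = math.sqrt(2)
q = rational_between(root2, root2 + 1e-9)
print(q, float(q))
```

Since m/n is at most a + 1/n, and 1/n is smaller than b − a, the result cannot overshoot b.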


So to summarise 


This is not really a definition, but you should know these symbols 


• ℕ = the natural numbers,
• ℤ = the integers,
• ℚ = the rationals, and
• ℝ = the reals.


» More on real numbers 


In the preceding paragraphs we have talked about the decimal expansions of real numbers 
and there is just one more point that we wish to touch on. The decimal expansions of 
rational numbers are always periodic, that is the expansion eventually starts to repeat itself. 
For example

2/15 = 0.133333333...
5/17 = 0.294117647058823529411764705882352941176470588235294117647058823...

where the repeating block (the period) of the last example is the 16 digits 2941176470588235. On the
other hand, irrational numbers, such as √2 and π, have expansions that never repeat.

4 The existence of such numbers caused mathematicians (particularly the ancient Greeks) all sorts of
philosophical problems. They thought that the natural numbers were somehow fundamental and
beautiful and “natural”. The rational numbers you can get very easily by taking “ratios”, a process
that is still somehow quite sensible. There were quite influential philosophers (in Greece at least)
called Pythagoreans (disciples of Pythagoras originally) who saw numbers as almost mystical objects
explaining all the phenomena in the universe, including beauty. Famously they found fractions in
musical notes and held that “numbers constitute the entire heavens”. They believed that everything could be
explained by whole numbers and their ratios. But soon after Pythagoras’ theorem was discovered, so
were numbers that are not rational. The first proof of the existence of irrational numbers is sometimes
attributed to Hippasus in around 400 BCE (the date is not really known). It seems that his philosopher “friends”
were not very happy about this and essentially exiled him. Some accounts suggest that he was drowned
by them.
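The eventual repetition in the decimal expansion of a rational number can be seen by carrying out long division and watching the remainders: there are only finitely many possible remainders, so one of them must recur, and from then on the digits repeat. The Python sketch below does exactly this. The name decimal_expansion is ours, and the sketch assumes 0 ≤ p < q.

```python
def decimal_expansion(p: int, q: int) -> tuple[str, str]:
    """Long division of p/q; returns (digits before the period, repeating block)."""
    digits = []
    seen = {}  # remainder -> position at which it first appeared
    r = p
    while r not in seen:
        seen[r] = len(digits)
        r *= 10
        digits.append(str(r // q))  # next decimal digit
        r %= q                      # next remainder
    start = seen[r]                 # the period begins where this remainder first occurred
    return "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(2, 15))
print(decimal_expansion(5, 17))
```

For a terminating decimal such as 1/4 the repeating block is simply "0".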

If we want to think of real numbers as their decimal expansions, then we need those 
expansions to be unique. That is, we don’t want to be able to write down two different 
expansions, each giving the same real number. Unfortunately there is an infinite set of
numbers that do not have unique expansions. Consider the number 1. We usually just 
write “1”, but as a decimal expansion it is 


1.00000000000. . . 
that is, a single 1 followed by an infinite string of 0’s. Now consider the following number 
0.99999999999 ... 


This second decimal expansion actually represents the same number: the number 1.
Let’s prove this. First call the real number this represents q, then

q = 0.99999999999...

Let’s use a little trick to get rid of the long string of trailing 9’s. Consider 10q:

q = 0.99999999999...
10q = 9.99999999999...

If we now subtract one from the other we get

9q = 9.0000000000...


and so we are left with q = 1.0000000.... So both expansions represent the same real 
number. 
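A numerical check is not a proof, but it can make the result feel less surprising. The Python sketch below (the name nines is ours) builds 0.9, 0.99, 0.999, ... as exact fractions and shows that the gap to 1 after n nines is exactly 1/10^n, which shrinks below any positive number.

```python
from fractions import Fraction


def nines(n: int) -> Fraction:
    """The number 0.99...9 with n nines, as an exact fraction (n >= 1)."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))


for n in (1, 5, 10):
    print(n, 1 - nines(n))  # the gap between 1 and 0.99...9
```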

Thankfully this sort of thing only happens with rational numbers of a particular form 
— those whose denominators are products of 2s and 5s. For example 


6/5 = 1.2000000... = 1.19999999...
−7/32 = −0.2187500000... = −0.2187499999...
9/20 = 0.45000000... = 0.4499999...


We can formalise this result in the following theorem (which we haven’t proved in general, 
but it’s beyond the scope of the text to do so): 


Theorem 0.1.2. 


Let x be a real number. Then x must fall into one of the following two categories,

• x has a unique decimal expansion, or

• x is a rational number of the form a/(2^k 5^ℓ) where a ∈ ℤ and k, ℓ are non-negative
integers.

In the second case, x has exactly two expansions, one that ends in an infinite
string of 9’s and the other ending in an infinite string of 0’s.
When we do have a choice of two expansions, it is usual to avoid the one that ends in
an infinite string of 9’s and write the other instead (omitting the infinite trailing string of
0’s).
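To test whether a given rational number has the form described in the second category of Theorem 0.1.2, we can reduce it to lowest terms and strip all factors of 2 and 5 from the denominator. This is only a sketch: the name has_2_5_denominator is hypothetical, and Python's Fraction type stands in for a rational number.

```python
from fractions import Fraction


def has_2_5_denominator(x: Fraction) -> bool:
    """True when x = a / (2^k 5^l) for some integer a and non-negative integers k, l."""
    d = x.denominator  # Fraction always stores x in lowest terms
    for p in (2, 5):
        while d % p == 0:
            d //= p    # remove every factor of p
    return d == 1      # nothing other than 2s and 5s was present


for x in (Fraction(6, 5), Fraction(-7, 32), Fraction(9, 20), Fraction(2, 15)):
    print(x, has_2_5_denominator(x))
```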


0.2 Sets


All of you will have done some basic bits of set-theory in school: sets, intersections, unions,
Venn diagrams and so on. Set theory now appears so thoroughly throughout mathematics
that it is difficult to imagine how mathematics could have existed without it. It is really
quite surprising that set theory is a much newer part of mathematics than calculus.
Mathematically rigorous set theory was really only developed in the 19th Century, primarily
by Georg Cantor⁵. Mathematicians were using sets before then (of course), however they
were doing so without defining things too rigorously and formally.

In mathematics (and elsewhere, including “real life”) we are used to dealing with col- 
lections of things. For example 


• a family is a collection of relatives.
• a hockey team is a collection of hockey players.
• a shopping list is a collection of items we need to buy.


Generally when we give mathematical definitions we try to make them very formal 
and rigorous so that they are as clear as possible. We need to do this so that when we 
come across a mathematical object we can decide with complete certainty whether or not 
it satisfies the definition. 

Unfortunately, giving a completely rigorous definition of “set” would
take up far more of our time than we would really like⁶.

A “set” is a collection of objects. The objects are referred to as “elements” or
“members” of the set.


Now — just a moment to describe some conventions. There are many of these in 
mathematics. These are not firm mathematical rules, but just traditions. It makes it much 
easier for people reading your work to understand what you are trying to say. 


5 An extremely interesting mathematician who is responsible for much of our understanding of infinity.
Arguably his most famous results are that there are more real numbers than integers, and that there are
an infinite number of different infinities. His work, though now considered to be extremely important,
was not accepted by his peers, and he was labelled “a corrupter of youth” for teaching it. For some
reason we know that he spent much of his honeymoon talking and doing mathematics with Richard
Dedekind.

6 The interested reader is invited to google (or whichever search engine you prefer — DuckDuckGo?) 
“Russell’s paradox”, “Axiomatic set theory” and “Zermelo-Fraenkel set theory” for a more complete 
and far more detailed discussion of the basics of sets and why, when you dig into them a little, they are 
not so basic. 
• Use capital letters to denote sets: A, B, C, X, Y etc.
• Use lower case letters to denote elements of the sets: a, b, c, x, y.


So when you are writing up homework, or just describing what you are doing, then if you
stick with these conventions people reading your work (including the person marking
your exams) will know “Oh, A is that set they are talking about” and “a is an element
of that set”. On the other hand, if you use any old letter or symbol it is correct, but
confusing for the reader. Think of it as being a bit like spelling — if you don’t spell words 
correctly people can usually still understand what you mean, but it is much easier if you 
spell words the same way as everyone else. 
We will encounter more of these conventions as we go — another good one is 


• The letters i, j, k, l, m, n usually denote integers (like 1, 2, 3, −5, 18, ...).

• The letters x, y, z, w usually denote real numbers (like 1.4323, π, √2, 6.0221415 × 10²³, ...
and so forth).


So now that we have defined sets, what can we do with them? There is only one thing we
can ask of a set

“Is this object in the set?”

and the set will answer

“yes” or “no”.


For example, if A is the set of even numbers we can ask “Is 4 in A?” We get back the 
answer “yes”. We write this as 


4 ∈ A

While if we ask “Is 3 in A?”, we get back the answer “no”. Mathematically we would
write this as

3 ∉ A

So this symbol “∈” is mathematical shorthand for “is an element of”, while the same symbol
with a stroke through it “∉” is shorthand for “is not an element of”.

Notice that both of these statements, though they are written down as short strings of 
three symbols, are really complete sentences. That is, when we read them out we have 


“4 ∈ A” is read as “Four is an element of A.”
“3 ∉ A” is read as “Three is not an element of A.”

The mathematical symbols like “+”, “=” and “∈” are shorthand⁷ and mathematical
statements like “4 + 3 = 7” are complete sentences.
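Many programming languages have a direct counterpart of this one question. As a small illustration, Python's built-in sets answer it with the operators "in" and "not in". The sketch assumes a finite window of the even integers as a stand-in for the infinite set A.

```python
# Assumption: a finite window of the even integers stands in for the infinite set A.
A = {n for n in range(-100, 101) if n % 2 == 0}

print(4 in A)      # "4 is an element of A"
print(3 not in A)  # "3 is not an element of A"
```

Each line is a complete statement that is either true or false, just like the mathematical sentences above.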


7 Precise definitions aside, by “shorthand” we mean a collection of accepted symbols and abbreviations
to allow us to write more quickly and hopefully more clearly. People have been using various systems
of shorthand as long as people have been writing. Many of these are used and understood only by the
individual, but if you want people to be able to understand what you have written, then you need to
use shorthand that is commonly understood.
This is an important point: mathematical writing is just like any other sort of writing.
It is very easy to put a bunch of symbols or words down on the page, but if we would like 
it to be easy to read and understand, then we have to work a bit harder. When you write 
mathematics you should keep in mind that someone else should be able to read it and 
understand it. 


Easy reading is damn hard writing. 
Nathaniel Hawthorne, but possibly also a few others like Richard Sheridan. 


We will come across quite a few different sets when doing mathematics.
It must be completely clear from the definition how to answer the question “Is this object 
in the set or not?” 


• “Let A be the set of even integers between 1 and 13.” This is nice and clear.

• “Let B be the set of tall people in this class room.” This is not clear.


More generally if there are only a small number of elements in the set we just list them all 
out 


• “Let C = {1, 2, 3}.”

When we write out the list we put the elements inside braces “{ }”. Note that the order
we write things in doesn’t matter


C = {1,2,3} = {2,1,3} = {3,2,1} 


because the only thing we can ask is “Is this object an element of C?” We cannot ask more
complex questions like “What is the third element of C?”; we require more sophisticated
mathematical objects to ask such questions⁸. Similarly, it doesn’t matter how many times
we write the same object in the list

C = {1, 1, 1, 1, 2, 3, 3, 3, 3, 1, 2, 1, 2, 1, 3} = {1, 2, 3}

because all we ask is “Is 1 ∈ C?”, not “How many times is 1 in C?”.
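Python's built-in set type behaves the same way, which makes for a quick sanity check of both points. This is just an illustration using a standard language feature, with a list shown for contrast.

```python
C = {1, 2, 3}

# The order in which we write the elements does not matter.
print(C == {2, 1, 3} == {3, 2, 1})

# Writing an element several times does not matter either.
print({1, 1, 1, 2, 3, 3, 1, 2} == C)

# A list, by contrast, remembers both order and repeats.
print([1, 2, 3] == [2, 1, 3])
```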
Now — if the set is a bit bigger then we might write something like this 


• C = {1, 2, 3, ..., 40}, the set of all integers between 1 and 40 (inclusive).
• A = {1, 4, 9, 16, ...}, the set of all perfect squares⁹.


The “...” is again shorthand for the missing entries. You have to be careful with this as 
you can easily confuse the reader 


• B = {3, 5, 7, ...}: is this all odd primes, or all odd numbers bigger than 1, or something else?
What is written is not sufficient for us to have a firm idea of what the writer intended.


Only use this where it is completely clear by context. A few extra words can save the 
reader (and yourself) a lot of confusion. 


Always think about the reader. 


8 The interested reader is invited to look at “lists”, “multisets”, “totally ordered sets” and “partially
ordered sets” amongst many other mathematical objects that generalise the basic idea of sets.
9 i.e. integers that can be written as the square of another integer.
0.3 Other important sets


We have seen a few important sets above, namely ℕ, ℤ, ℚ and ℝ. However, arguably
the most important set in mathematics is the empty set.

The empty set (or null set or void set) is the set which contains no elements. It is
denoted ∅. For any object x, we always have x ∉ ∅; hence ∅ = {}.


Note that it is important to realise that the empty set is not nothing; think of it as an 
empty bag. Also note that with quite a bit of hard work you can actually define the natural 
numbers in terms of the empty set. Doing so is very formal and well beyond the scope of 
this text. 

When a set does not contain too many elements it is fine to specify it by listing out its 
elements. But for infinite sets or even just big sets we can’t do this and instead we have to 
give the defining rule. For example the set of all perfect square numbers we write as 


S = {x s.t. x = k² where k ∈ ℤ}

Notice we have used another piece of short-hand here, namely s.t., which stands for
“such that” or “so that”. We read the above statement as “S is the set of elements x such
that x equals k-squared where k is an integer”. This is the standard way of writing a set
defined by a rule, though there are several shorthands for “such that”. We shall use two
of them:

P = {p s.t. p is prime} = {p | p is prime}


Other people also use “:” as shorthand for “such that”. In this text we will use “|” or “s.t.”
(being the preference of the authors), but some other texts use “:”. You should recognise all
three of these shorthands.


Example 0.3.2 (examples of sets)

Even more examples.

• Let A = {2, 3, 5, 7, 11, 13, 17, 19} and let

B = {a ∈ A | a < 8} = {2, 3, 5, 7}

the set of elements of A that are strictly less than 8.

• Even and odd integers

E = {n | n is an even integer}
  = {n | n = 2k for some k ∈ ℤ}
  = {2n | n ∈ ℤ},

and similarly

O = {n | n is an odd integer}
  = {2n + 1 | n ∈ ℤ}.

• Square integers

S = {n² | n ∈ ℤ}.

The set¹⁰ S′ = {n² | n ∈ ℕ} is not the same as S because S′ does not contain the
number 0, which is definitely a square integer and 0 is in S. We could also write
S = {n² | n ∈ ℤ, n ≥ 0} and S = {n² | n = 0, 1, 2, ...}.

End of Example 0.3.2
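Set-builder notation translates almost symbol for symbol into Python's set comprehensions, which can be a helpful way to experiment with definitions like those in Example 0.3.2. The sketch below is an illustration only: a computer cannot hold all of ℤ, so we assume a small finite window of integers (the name window is ours).

```python
window = range(-5, 6)  # assumption: a finite stand-in for the integers

A = {2, 3, 5, 7, 11, 13, 17, 19}
B = {a for a in A if a < 8}             # {a in A | a < 8}

evens = {2 * n for n in window}         # {2n | n in Z}
odds = {2 * n + 1 for n in window}      # {2n + 1 | n in Z}

squares = {n**2 for n in window}                      # {n^2 | n in Z}
positive_squares = {n**2 for n in window if n >= 1}   # {n^2 | n in N}

print(sorted(B))
print(sorted(squares), sorted(positive_squares))
```

The last line shows the difference between S and S′ from the example: 0 appears in the first set but not in the second.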


The sets A and B in the above example illustrate an important point. Every element in
B is an element in A, and so we say that B is a subset of A.

Let A and B be sets. We say “A is a subset of B” if every element of A is also an
element of B. We denote this A ⊆ B (or B ⊇ A). If A is a subset of B and A and B
are not the same, so that there is some element of B that is not in A, then we say
that A is a proper subset of B. We denote this by A ⊂ B (or B ⊃ A).


Two things to note about subsets: 
• Let A be a set. It is always the case that ∅ ⊆ A.

• If A is not a subset of B then we write A ⊈ B. This is the same as saying that there is
some element of A that is not in B. That is, there is some a ∈ A such that a ∉ B.
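These subset relations can be tried out with Python's set comparison operators, where "<=" plays the role of ⊆ and "<" the role of ⊂. The sketch reuses the sets A and B from Example 0.3.2, and the name witnesses is ours.

```python
A = {2, 3, 5, 7, 11, 13, 17, 19}
B = {2, 3, 5, 7}

print(B <= A)      # B is a subset of A
print(B < A)       # B is a proper subset of A
print(set() <= A)  # the empty set is a subset of every set
print(A <= B)      # A is not a subset of B ...

# ... because some elements of A are not in B.
witnesses = {a for a in A if a not in B}
print(sorted(witnesses))
```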


Example 0.3.4 (subsets) 


Let S = {1,2}. What are all the subsets of S? Well — each element of S can either be in 
the subset or not (independent of the other elements of the set). So we have 2 x 2 = 4 
possibilities: neither 1 nor 2 is in the subset, 1 is but 2 is not, 2 is but 1 is not, and both 1 
and 2 are. That is 


∅, {1}, {2}, {1, 2} ⊆ S

This argument can be generalised with a little work to show that a set that contains exactly
n elements has exactly 2^n subsets.

End of Example 0.3.4

10 Notice here we are using another common piece of mathematical short-hand. Very often in mathematics
we will be talking or writing about some object, like the set S above, and then we will create a closely
related object. Rather than calling this new object by a new symbol (we could have used T or R or...),
we instead use the same symbol but with some sort of accent, such as the little single quote mark we
added to the symbol S to make S′ (read “S prime”). The point of this is to let the reader know that this
new object is related to the original one, but not the same. You might also see Ṡ, Ŝ, S̄, S̃ and others.
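The counting argument in Example 0.3.4 can be checked for small sets by listing every subset. The Python sketch below is one way to do that: the function name subsets is ours, and frozenset is used because Python's ordinary sets cannot themselves be elements of a set.

```python
from itertools import combinations


def subsets(s) -> list[frozenset]:
    """All subsets of a finite collection s, smallest first."""
    items = list(s)
    return [
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    ]


for subset in subsets({1, 2}):
    print(set(subset) or "{}")

print(len(subsets(range(5))))  # number of subsets of a 5-element set
```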


In much of our work with functions later in the text we will need to work with subsets 
of real numbers, particularly segments of the “real line”. A convenient and standard way 
of representing such subsets is with interval notation. 


Let a, b ∈ ℝ such that a < b. We name the subset of all numbers between a and b
in different ways depending on whether or not the ends of the interval (a and b)
are elements of the subset.

• The closed interval [a, b] = {x ∈ ℝ : a ≤ x ≤ b}; both end points are
included.

• The open interval (a, b) = {x ∈ ℝ : a < x < b}; neither end point is
included.

We also define half-open¹¹ intervals which contain one end point but not the
other:

(a, b] = {x ∈ ℝ : a < x ≤ b}        [a, b) = {x ∈ ℝ : a ≤ x < b}

We sometimes also need unbounded intervals

[a, ∞) = {x ∈ ℝ : a ≤ x}        (a, ∞) = {x ∈ ℝ : a < x}
(−∞, b] = {x ∈ ℝ : x ≤ b}        (−∞, b) = {x ∈ ℝ : x < b}

These unbounded intervals do not include “±∞”, so that end of the interval is
always open¹².
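Since an interval is just a set of real numbers, the only thing that distinguishes [a, b], (a, b) and the half-open versions is how they answer the membership question at the two end points. The Python sketch below makes this explicit. The class Interval and its field names are our own invention for this illustration, and floating point numbers stand in for real numbers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float
    lo_closed: bool = True  # is the left end point included?
    hi_closed: bool = True  # is the right end point included?

    def __contains__(self, x: float) -> bool:
        above = x >= self.lo if self.lo_closed else x > self.lo
        below = x <= self.hi if self.hi_closed else x < self.hi
        return above and below


closed = Interval(1, 2)                         # [1, 2]
open_ = Interval(1, 2, False, False)            # (1, 2)
unbounded = Interval(0, math.inf, True, False)  # [0, infinity)

print(1 in closed, 1 in open_, 1.5 in open_)
print(10**6 in unbounded, math.inf in unbounded)
```

Note that the unbounded interval is declared open at the infinite end, matching the remark above.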


» More on sets


So we now know how to say that one set is contained within another. We will now define 
some other operations on sets. Let us also start to be a bit more precise with our definitions 
and set them out carefully as we get deeper into the text. 


11 Also called “half-closed”. The preference for one term over the other may be related to whether a 500ml
glass containing 250ml of water is half-full or half-empty.

12 Infinity is not a real number. As mentioned in an earlier footnote, Cantor proved that there are an
infinite number of different infinities and so it is incorrect to think of ∞ as being a single number. As
such it cannot be an element in an interval of the real line. We suggest that the reader that wants to learn
more about how mathematics handles infinity look up transfinite numbers and transfinite arithmetic.
Needless to say these topics are beyond the scope of this text.
Definition.

Let A and B be sets. We define the union of A and B, denoted A ∪ B, to be the set
of all elements that are in at least one of A or B.

A ∪ B = {x | x ∈ A or x ∈ B}


It is important to realise that we are using the word “or” in a careful mathematical
sense. We mean that x belongs to A or x belongs to B or both. Whereas in normal everyday
English “or” is often used to mean “exclusive or”: A or B but not both¹³.

We also start the definition by announcing “Definition” so that the reader knows “We 
are about to define something important”. We should also make sure that everything is 
(reasonably) self-contained — we are not assuming the reader already knows A and B are 
sets. 

It is vital that we make our definitions clear otherwise anything we do with the definitions
will be very difficult to follow. As writers we must try to be nice to our readers¹⁴.


Definition.

Let A and B be sets. We define the intersection of A and B, denoted A ∩ B, to be
the set of elements that belong to both A and B.

A ∩ B = {x | x ∈ A and x ∈ B}


Again note that we are using the word “and” in a careful mathematical sense (which 
is pretty close to the usual use in English). 


Example 0.3.8 (Union and intersection) 


Let A = {1, 2, 3, 4}, B = {p : p is prime}, C = {5, 7, 9} and D = {even positive integers}.
Then

A ∩ B = {2, 3}
B ∩ D = {2}
A ∪ C = {1, 2, 3, 4, 5, 7, 9}
A ∩ C = ∅

In this last case we see that the two sets have no elements in common; they are said to
be disjoint.

End of Example 0.3.8
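The four computations in Example 0.3.8 can be reproduced with Python's set operators, where "|" is union and "&" is intersection. This is only a sketch: B and D are infinite sets, so we assume it is enough to look at the integers from 1 to 50, and the helper is_prime is our own.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n**0.5) + 1))


window = range(1, 51)  # assumption: a finite stand-in for the positive integers

A = {1, 2, 3, 4}
B = {p for p in window if is_prime(p)}
C = {5, 7, 9}
D = {n for n in window if n % 2 == 0}

print(A & B)            # intersection of A and B
print(B & D)            # the only even prime
print(sorted(A | C))    # union of A and C
print(A.isdisjoint(C))  # A and C have no elements in common
```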
13 When you are asked for your dining preferences on a long flight you are usually asked something like
“Chicken or beef?” You get one or the other, but not both. Unless you are way at the back near the
toilets, in which case you will be presented with whichever meal was less popular. Probably fish.

14 If you are finding this text difficult to follow then please complain to us authors and we will do our best
to improve it.
0.4 Functions


Now that we have reviewed basic ideas about sets we can start doing more interesting 
things with them — functions. 

When we are introduced to functions in mathematics, it is almost always as formulas. 
We take a number x and do some things to it to get a new number y. For example,

y = f(x) = 3x − 7


Here, we take a number x, multiply it by 3 and then subtract seven to get the result. 

This view of functions — a function is a formula — was how mathematicians defined 
them up until the 19th century. As basic ideas of sets became better defined, people revised 
ideas surrounding functions. The more modern definition of a function between two sets 
is that it is a rule which assigns to each element of the first set a unique element of the 
second set. 

Consider the set of days of the week, and the set containing the alphabet 


A = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}
B = {a, b, c, d, e, ..., x, y, z}


We can define a function f that takes a day (that is, an element of A) and turns it into the 
first letter of that day (that is, an element of B). This is a valid function, though there is no 
formula. We can draw a picture of the function as 


Figure 0.4.1. [Diagram: an arrow from each day in A to its first letter in B.]


Clearly such pictures will work for small sets, but will get very messy for big ones. 
When we shift back to talking about functions on real numbers, then we will switch to 
using graphs of functions on the Cartesian plane. 

This example is pretty simple, but this serves to illustrate some important points. If 
our function gives us a rule for taking elements in A and turning them into elements from 
B then 


• the function must be defined for all elements of A; that is, no matter which element
of A we choose, the function must be able to give us an answer. Every function must
have this property.

• on the other hand, we don’t have to “hit” every element from B. In the above example,
we miss almost all the letters in B. A function that does reach every element of
B is said to be “surjective” or “onto”.

• a given element of B may be reached by more than one element of A. In the above
example, the days “Tuesday” and “Thursday” both map to the letter T and similarly
the letter S is mapped to by both “Sunday” and “Saturday”. A function which does
not do this, that is, every element in A maps to a different element in B, is called
“injective” or “one-to-one”. We will come back to this later when we discuss
inverse functions in Section 0.6.


Summarising this more formally, we have 


Let A, B be non-empty sets. A function f from A to B is a rule or formula that
takes elements of A as inputs and returns elements of B as outputs. We write this
as

f : A → B

and if f takes a ∈ A as an input and returns b ∈ B then we write this as f(a) = b.
Every function must satisfy the following two conditions

• The function must be defined on every possible input from the set A. That
is, no matter which element a ∈ A we choose, the function must return an
element b ∈ B so that f(a) = b.

• The function is only allowed to return one result for each input¹⁵. So if we
find that f(a) = b₁ and f(a) = b₂ then the only way that f can be a function
is if b₁ is exactly the same as b₂.


We must include the input and output sets A and B in the definition of the function. 
This is one of the reasons that we should not think of functions as just formulas. The input 
and output sets have proper mathematical names, which we give below: 


15 You may have learnt this in the context of plotting functions on the Cartesian plane, as “the vertical line
test”. If the graph intersects a vertical line twice, then the same x-value will give two y-values and so
the graph does not represent a function.
Let f : A → B be a function. Then

• the set A of inputs to our function is the “domain” of f,

• the set B which contains all the results is called the “codomain”,

• We read “f(a) = b” as “f of a is b”, but sometimes we might say “f maps a
to b” or “b is the image of a”.

• The codomain must contain all the possible results of the function, but it
might also contain a few other elements. The subset of B that is exactly the
outputs of f is called the “range” of f. We define it more formally by

range of f = {b ∈ B | there is some a ∈ A so that f(a) = b}
           = {f(a) ∈ B | a ∈ A}

The only elements allowed in that set are those elements of B that are the
images of elements in A.


Example 0.4.3 (domains and ranges) 


Let us go back to the “days of the week” function example that we worked on above. We
can define the domain, codomain and range:

• The domain, A, is the set of days of the week.
• The codomain, B, is the 26 letters of the alphabet.

• The range is the set {F, M, T, S, W}; no other elements of B are images of inputs
from A.

End of Example 0.4.3
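A function between small finite sets can be written down as a lookup table, and a Python dictionary is a convenient way to do so because it stores exactly one output for each input. The sketch below rebuilds the “days of the week” function and computes its range. It assumes the codomain consists of capital letters (so that the first letters match), and the variable names are ours.

```python
import string

days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
A = set(days)                    # domain
B = set(string.ascii_uppercase)  # codomain (assumption: capital letters)

f = {day: day[0] for day in A}   # the rule: a day maps to its first letter

range_of_f = set(f.values())     # the letters that are actually reached
print(sorted(range_of_f))
print("surjective:", range_of_f == B)            # does f reach every letter?
print("injective:", len(range_of_f) == len(A))   # do different days give different letters?
```

Both checks come out negative, in line with the discussion above: most letters are missed, and T and S are each reached twice.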


Example 0.4.4 (more domains and ranges)

A more numerical example: let g : ℝ → ℝ be defined by the formula g(x) = x². Then

• the domain and codomain are both the set of all real numbers, but
• the range is the set [0, ∞).

Now let h : [0, ∞) → [0, ∞) be defined by the formula h(x) = √x. Then

• the domain and codomain are both the set [0, ∞), that is all non-negative real
numbers, and

• in this case the range is equal to the codomain, namely [0, ∞).

End of Example 0.4.4
Example 0.4.5 (piece-wise function)

Yet another numerical example.

V : [−1, 1] → ℝ defined by

V(t) = 0     if −1 ≤ t < 0
V(t) = 120   if 0 ≤ t ≤ 1

This is an example of a “piece-wise” function; that is, one that is not defined by a single
formula, but instead defined piece-by-piece. This function has domain [−1, 1] and its
range is {0, 120}. We could interpret this function as measuring the voltage across a switch
that is flipped on at time t = 0.

End of Example 0.4.5
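A piece-wise function corresponds naturally to a chain of conditions in code. The Python sketch below is one possible rendering of the switch voltage from Example 0.4.5. It assumes the voltage is 0 before the switch is flipped (t < 0) and 120 from t = 0 onwards, which is consistent with the stated range {0, 120}, and it rejects inputs outside the domain [−1, 1] rather than inventing a value for them.

```python
def V(t: float) -> float:
    """Voltage across the switch at time t, for t in [-1, 1]."""
    if not -1 <= t <= 1:
        raise ValueError("t is outside the domain [-1, 1]")
    if t < 0:
        return 0.0  # assumption: no voltage before the switch is flipped
    return 120.0    # switch is on from t = 0 onwards


samples = [-1, -0.5, 0, 0.5, 1]
print([V(t) for t in samples])
print({V(t) for t in samples})  # the values that occur, i.e. the range
```

Raising an error for t outside [−1, 1] reflects the point that the domain is part of the definition of the function.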


Almost all the functions we look at from here on will be formulas. However it is
important to note that we have to include the domain and codomain when we describe
the function. If the domain and codomain are not stated explicitly then we should assume 
that both are R. 


0.5 Parsing formulas


Consider the formula 


f(x) = (1 + x) / (1 + 2x − x²)


This is an example of a simple rational function — that is, the ratio of two polynomials. 
When we start to examine these functions later in the text, it is important that we are able 
to understand how to evaluate such functions at different values of x. For example 

f(5) = (1 + 5) / (1 + 10 − 25) = 6 / (−14) = −3/7


More important, however, is that we understand how we decompose this function into 
simpler pieces. Since much of your calculus course will involve creating and studying 
complicated functions by building them up from simple pieces, it is important that you 
really understand this point. 

Now to get there we will take a small excursion into what are called parse-trees. You 
already implicitly use these when you evaluate the function at a particular value of x, but 
our aim here is to formalise this process a little more. 

We can express the steps used to evaluate the above formula as a tree-like diagram¹⁶.
We can decompose this formula as the following tree-like diagram.

16 Such trees appear in many areas of mathematics and computer science. The reason for the name is that
they look rather like trees: starting from their base they grow and branch out towards their many
leaves. For some reason, which remains mysterious, they are usually drawn upside down.

Figure 0.5.1.


Let us explain the pieces here. 


• The picture consists of boxes and arrows which are called “nodes” and “edges” respectively.

• There are two types of boxes, those containing numbers and the variable x, and those
containing arithmetic operations “+”, “−”, “×” and “/”.

• If we wish to represent the formula 3 + 5, then we can draw this as the following
cherry-like configuration

• By stringing such little “cherries” together we can describe more complicated formulas.
For example, if we compute “(3 + 5) × 2”, we first compute “(3 + 5)” and
then multiply the result by 2. The corresponding diagrams are


The tree we drew in Figure 0.5.1 above representing our formula has x in some of the
boxes, and so when we want to compute the function at a particular value of x, say at
x = 5, then we replace those “x”s in the tree by that value and then compute back up
the tree. See the example below

Figure 0.5.2.


and we are done. 


This is not the only parse tree associated with the formula for f(x); we could also
decompose it as

Figure 0.5.3.

We are able to do this because when we compute the denominator 1 + 2x − x², we can
compute it as

\[ 1+2x-x^2 = \text{either } (1+2x)-x^2 \text{ or } 1+(2x-x^2). \]

Both¹⁷ are correct because addition is “associative”. Namely

\[ a+b+c = (a+b)+c = a+(b+c). \]

Multiplication is also associative:

\[ a \times b \times c = (a \times b) \times c = a \times (b \times c). \]


Example 0.5.1 (parsing a formula)


Consider the formula

\[ g(t) = \left(\frac{t+\pi}{t-\pi}\right) \cdot \sin\left(\frac{t+\pi}{2}\right). \]

This introduces a new idea: we have to evaluate (t + π)/2 and then compute the sine of that
number. The corresponding tree can be written as


17 We could also use, for example, 1 + 2x − x² = (1 − x²) + 2x.

If we want to evaluate this at t = π/2 then we get the following...


and we are done.

End of Example 0.5.1


It is highly unlikely that you will ever need to explicitly construct such a tree for any 
problem in the remainder of the text. The main point of introducing these objects and 
working through a few examples is to realise that all the functions that we will examine are 
constructed from simpler pieces. In particular we have constructed all the above examples 
from simple “building blocks” 


• constants: fixed numbers like 1, π and so forth

• variables: usually x or t, but sometimes other symbols

• standard functions: like trigonometric functions (sine, cosine and tangent), exponentials
and logarithms.

These simple building blocks are combined using arithmetic

• addition and subtraction: a + b and a − b

• multiplication and division: a · b and a/b

• raising to a power: aⁿ

• composition: given two functions f(x) and g(x) we form a new function f(g(x))
by evaluating y = g(x) and then evaluating f(y) = f(g(x)).
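
The idea of a parse tree can also be tried out on a computer. The following is only a minimal sketch, not part of the original text: it assumes a tree is stored as nested tuples of the form (operation, left, right), and the names `OPS`, `evaluate`, `f_tree` and `f_tree_alt` are made up for this illustration. It evaluates f(x) = (1 + x)/(1 + 2x − x²) from the leaves upwards, using exact fractions.

```python
from fractions import Fraction
import operator

# Arithmetic operations allowed in the internal nodes (hypothetical encoding).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def evaluate(node, x):
    """Evaluate a parse tree at a given value of x, working from the leaves up."""
    if node == "x":                      # leaf holding the variable
        return x
    if not isinstance(node, tuple):      # leaf holding a constant
        return Fraction(node)
    op, left, right = node               # internal node: (operation, left, right)
    return OPS[op](evaluate(left, x), evaluate(right, x))

# f(x) = (1 + x) / (1 + 2x - x^2), with the denominator grouped as (1 + 2x) - x^2
numerator = ("+", 1, "x")
f_tree = ("/", numerator, ("-", ("+", 1, ("*", 2, "x")), ("*", "x", "x")))

# The other grouping, 1 + (2x - x^2), is a different tree with the same values.
f_tree_alt = ("/", numerator, ("+", 1, ("-", ("*", 2, "x"), ("*", "x", "x"))))

print(evaluate(f_tree, Fraction(5)))      # -3/7
print(evaluate(f_tree_alt, Fraction(5)))  # -3/7
```

Both trees give the same value at x = 5, which matches the hand computation of f(5) earlier in this section.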


During the rest of the course when we learn how to compute limits and derivatives, our 
computations require us to understand the way we construct functions as we have just 
described. 

That is, in order to compute the derivative¹⁸ of a function we have to see how to construct
the function from these building blocks (i.e. the constants, variables and standard
functions) using arithmetic operations. We will then construct the derivative by following
these same steps. There will be simple rules for finding the derivatives of the simpler
pieces and then rules for putting them together following the arithmetic used to construct
the function.


0.6 Inverse functions


There is one last thing that we should review before we get into the main material of 
the course and that is inverse functions. As we have seen above functions are really just 
rules for taking an input (almost always a number), processing it somehow (usually by a 
formula) and then returning an output (again, almost always a number). 


input number x ↦ f does “stuff” to x ↦ return number y


In many situations it will turn out to be very useful if we can undo whatever it is that our
function has done, i.e.


take output y ↦ do “stuff” to y ↦ return the original x


When it exists, the function “which undoes” the function f(x) is found by solving y =
f(x) for x as a function of y and is called the inverse function of f. It turns out that it is
not always possible to solve y = f(x) for x as a function of y. Even when it is possible, it
can be really hard to do¹⁹.

For example, a particle’s position, s, at time t is given by the formula s(t) = 7t
(sketched below). Given a calculator, and any particular number t, you can quickly work
out the corresponding position s. However, if you are asked the question “When does
the particle reach s = 4?” then to answer it we need to be able to “undo” s(t) = 4 to
isolate t. In this case, because s(t) is always increasing, we can always undo s(t) to get a
unique answer:

\[ s(t) = 7t = 4 \quad\text{if and only if}\quad t = \frac{4}{7}. \]

18 We get to this in Chapter 2; don’t worry about exactly what it is just now.

19 Indeed much of encryption exploits the fact that you can find functions that are very quick to do, but
very hard to undo. For example, it is very fast to multiply two large prime numbers together, but very
hard to take that result and factor it back into the original two primes. The interested reader should
look up trapdoor functions.


However, this question is not always so easy. Consider the sketch of y = sin(x) below;
when is y = 1/2? That is, for which values x is sin(x) = 1/2? To rephrase it again, at which
values of x does the curve y = sin x (which is sketched in the right half of Figure 0.6.1)
cross the horizontal straight line y = 1/2 (which is also sketched in the same figure)?


Figure 0.6.1.


We can see that there are going to be an infinite number of x-values that give y =
sin(x) = 1/2; there is no unique answer.

Recall (from Definition 0.4.1) that for any given input, a function must give a unique
output. So if we want to find a function that undoes s(t), then things are good, because
each s-value corresponds to a unique t-value. On the other hand, the situation with y =
sin x is problematic: any given y-value is mapped to by many different x-values. So
when we look for a unique answer to the question “When is sin x = 1/2?” we cannot
answer it.

This “uniqueness” condition can be made more precise: 


A function f is one-to-one (injective) when it never takes the same y value more
than once. That is


if x₁ ≠ x₂ then f(x₁) ≠ f(x₂)


There is an easy way to test this when you have a plot of the function: the horizontal
line test.

A function is one-to-one if and only if no horizontal line y = c intersects the
graph y = f(x) more than once.


i.e. every horizontal line intersects the graph either zero or one times. Never twice or
more. This test tells us that y = x³ is one-to-one, but y = x² is not. However note that if we
restrict the domain of y = x² to x ≥ 0 then the horizontal line test is passed. This is one of
the reasons we have to be careful to consider the domain of the function.
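
A rough numerical version of the horizontal line test can be sketched as follows. This is only an illustration under stated assumptions: it takes the two functions above to be y = x³ and y = x², it only looks at finitely many sample points (so it can detect a failure of the test but cannot prove that a function is one-to-one), and the helper name `looks_one_to_one` is invented for this sketch.

```python
def looks_one_to_one(f, xs, tol=1e-12):
    """Return False if two different sample points give (nearly) the same value."""
    values = sorted(f(x) for x in xs)
    # After sorting, a repeated value would appear as two equal neighbours.
    return all(b - a > tol for a, b in zip(values, values[1:]))

xs = [i / 10 for i in range(-30, 31)]   # sample points in [-3, 3]

print(looks_one_to_one(lambda x: x**3, xs))                          # True
print(looks_one_to_one(lambda x: x**2, xs))                          # False
print(looks_one_to_one(lambda x: x**2, [x for x in xs if x >= 0]))   # True
```

The last line shows the effect of restricting the domain: on the sample points with x ≥ 0 no value of x² is repeated.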


Figure 0.6.2. 


When a function is one-to-one then it has an inverse function. 


Let f be a one-to-one function with domain A and range B. Then its inverse
function is denoted f⁻¹ and has domain B and range A. It is defined by

\[ f^{-1}(y) = x \quad\text{whenever}\quad f(x) = y \]

for any y ∈ B.


So if f maps x to y, then f⁻¹ maps y back to x. That is f⁻¹ “undoes” f. Because of this
we have

\[ f^{-1}(f(x)) = x \quad\text{for any } x \in A \]
\[ f(f^{-1}(y)) = y \quad\text{for any } y \in B \]

We have to be careful not to confuse f⁻¹(x) with 1/f(x). The “−1” is not an exponent.


Example 0.6.4 


Let f(x) = x⁵ + 3 on domain R. To find its inverse we do the following

• Write y = f(x); that is y = x⁵ + 3.

• Solve for x in terms of y (this is not always easy): x⁵ = y − 3, so x = (y − 3)^(1/5).

• The solution is f⁻¹(y) = (y − 3)^(1/5).

• Recall that the “y” in f⁻¹(y) is a dummy variable. That is, f⁻¹(y) = (y − 3)^(1/5) means
that if you feed the number y into the function f⁻¹ it outputs the number (y − 3)^(1/5).
You may call the input variable anything you like. So if you wish to call the input
variable “x” instead of “y” then just replace every y in f⁻¹(y) with an x.

• That is f⁻¹(x) = (x − 3)^(1/5).

End of Example 0.6.4
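
One way to gain confidence in an inverse found by hand is to check numerically that it really undoes the function. The sketch below assumes the function in Example 0.6.4 is f(x) = x⁵ + 3 with inverse (y − 3)^(1/5); the name `f_inverse` is chosen just for this illustration.

```python
import math

def f(x):
    return x**5 + 3

def f_inverse(y):
    # Real fifth root of (y - 3). copysign restores the sign when y - 3 is
    # negative, since a negative float raised to the power 1/5 is not real in Python.
    u = y - 3
    return math.copysign(abs(u) ** (1 / 5), u)

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    # f_inverse(f(x)) should return the original x (up to rounding).
    print(x, f_inverse(f(x)))
```

Up to floating point rounding, each line prints the same number twice, as expected from f⁻¹(f(x)) = x.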


Example 0.6.5 


Let g(x) = √(x − 1) on the domain x ≥ 1. We can find the inverse in the same way:

\[ y = \sqrt{x-1} \]
\[ y^2 = x-1 \]
\[ x = y^2+1 = g^{-1}(y), \quad\text{or, writing input variable as “x”:} \]
\[ g^{-1}(x) = x^2+1. \]

End of Example 0.6.5


Let us now turn to finding the inverse of sin(x) — it is a little more tricky and we have 
to think carefully about domains. 


Example 0.6.6 


We have seen (back in Figure 0.6.1) that sin(x) takes each value y between −1 and +1 for
infinitely many different values of x (see the left-hand graph in the figure below). Consequently
sin(x), with domain −∞ < x < ∞, does not have an inverse function.


But notice that as x runs from −π/2 to +π/2, sin(x) increases from −1 to +1. (See the middle
graph in the figure above.) In particular, sin(x) takes each value −1 ≤ y ≤ 1 for exactly
one −π/2 ≤ x ≤ π/2. So if we restrict sin x to have domain −π/2 ≤ x ≤ π/2, it does have an
inverse function, which is traditionally called arcsine (see Appendix A.9).

That is, by definition, for each −1 ≤ y ≤ 1, arcsin(y) is the unique −π/2 ≤ x ≤ π/2 obeying
sin(x) = y. Equivalently, exchanging the dummy variables x and y throughout the last
sentence gives that for each −1 ≤ x ≤ 1, arcsin(x) is the unique −π/2 ≤ y ≤ π/2 obeying
sin(y) = x.

End of Example 0.6.6
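
The role of the restricted domain can be seen numerically. The following short sketch uses Python's standard `math.asin` as a stand-in for arcsine; the sample values are arbitrary choices for illustration.

```python
import math

# For each y in [-1, 1], arcsin(y) lies in [-pi/2, pi/2] and sin(arcsin(y)) = y.
for y in (-1.0, -0.5, 0.0, 0.5, 1.0):
    x = math.asin(y)
    print(y, x, math.sin(x))

# Going the other way only works on the restricted domain.
x = 2.5                          # outside [-pi/2, pi/2]
print(math.asin(math.sin(x)))    # gives pi - 2.5 (about 0.6416), not 2.5
```

So arcsine undoes sine only for inputs taken from −π/2 ≤ x ≤ π/2, which is exactly why the domain had to be restricted.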


So very roughly speaking, “Differential Calculus” is the study of how a function changes 
as its input changes. The mathematical object we use to describe this is the “derivative” of 
a function. To properly describe what this thing is we need some machinery; in particular 
we need to define what we mean by “tangent” and “limit”. We'll get back to defining the 
derivative in Chapter 2. 


1.1 Drawing tangents and a first limit


Our motivation for developing “limit” — being the title and subject of this chapter — is 
going to be two related problems of drawing tangent lines and computing velocity. 

Now — our treatment of limits is not going to be completely mathematically rigorous, 
so we won't have too many formal definitions. There will be a few mathematically precise 
definitions and theorems as we go, but we’ll make sure there is plenty of explanation 
around them. 

Let us start with the “tangent line” problem. Of course, we need to define “tangent”, 
but we won't do this formally. Instead let us draw some pictures. 


Figure 1.1.1. (Sketch labels: “Tangent to the curve at this point”, “Not a tangent line”.)


Here we have drawn two very rough sketches of the curve y = x² for x ≥ 0. These are
not very good sketches for a couple of reasons

• The curve in the figure does not pass through (0,0), even though (0,0) lies on y = x².

• The top-right end of the curve doubles back on itself and so fails the vertical line test
that all functions must satisfy¹: for each x-value there is exactly one y-value for
which (x,y) lies on the curve y = x².


So let’s draw those more carefully. 


Figure 1.1.2. Sketches of the curve y = x². (left) shows a tangent line, while (right) shows a line that is
not a tangent.


These are better. In both cases we have drawn y = x² (carefully) and then picked a
point on the curve; call it P. Let us zoom in on the “good” example:


Figure 1.1.3. 


We see that, the more we zoom in on the point P, the more the graph of the function 
(drawn in black) looks like a straight line — that line is the tangent line (drawn in blue). 


We see that as we zoom in on the point P, the graph of the function looks more and
more like a straight line. If we kept on zooming in on P then the graph of the function
would be indistinguishable from a straight line. That line is the tangent line (which we
have drawn in blue). A little more precisely, the blue line is “the tangent line to the function
at P”. We have to be a little careful, because if we zoom in at a different point, then
we will find a different tangent line.

1 Take a moment to go back and reread Definition 0.4.1.

Now let’s zoom in on the “bad” example. We see that the blue line looks very different
from the function; because of this, the blue line is not the tangent line at P.


Figure 1.1.4. 


Zooming in on P we see that the function (drawn in black) looks more and more like a 
straight line — however it is not the same line as that drawn in blue. Because of this the 
blue line is not the tangent line. 


Here are a couple more examples of tangent lines 


Figure 1.1.5. More examples of tangent lines.


The one on the left is very similar to the good example on y = x² that we saw above,
while the one on the right is different: it looks a little like the “bad” example, in that it
crosses the curve at some distant point. Why is the line in Figure 1.1.5(right)
a tangent while the line in Figure 1.1.2(right) is not a tangent? To see why, we should again
zoom in close to the point where we are trying to draw the tangent.

Figure 1.1.6.


As we saw above in Figure 1.1.4, when we zoom in around our example of “not a tangent
line” we see that the straight line looks very different from the curve at the “point
of tangency”, i.e. where we are trying to draw the tangent. The line drawn in
Figure 1.1.5(right) looks more and more like the function as we zoom in.


This example raises an important point: when we are trying to draw a tangent line,
we don’t care what the function does a long way from the point; the tangent line to the
curve at a particular point P depends only on what the function looks like close to that
point P.

To illustrate this consider the sketch of the function y = sin(x) and its tangent line 
at (x,y) = (0,0): 


Figure 1.1.7. 


As we zoom in, the graph of sin(x) looks more and more like a straight line — in 
fact it looks more and more like the line y = x. We have also sketched this tangent line. 
What makes this example a little odd is that the tangent line crosses the function. In the 
examples above, our tangent lines just “kissed” the curve and did not cross it (or at least 
did not cross it nearby). 

Using this idea of zooming in at a particular point, drawing a tangent line is not too 
hard. However, finding the equation of the tangent line presents us with a few challenges. 
Rather than leaping into the general theory, let us do a specific example. Let us find
the equation of the tangent line to the curve y = x² at the point P with coordinates²
(x,y) = (1,1).


To find the equation of a line we either need

• the slope of the line and a point on the line, or

• two points on the line, from which we can compute the slope via the formula

\[ m = \frac{y_2-y_1}{x_2-x_1} \]

and then write down the equation for the line via a formula such as

\[ y = m \cdot (x-x_1) + y_1. \]


We cannot use the first method because we do not know what the slope of the tangent 
line should be. To work out the slope we need calculus — so we'll be able to use this 
method once we get to the next chapter on “differentiation”. 

It is not immediately obvious how we can use the second method, since we only have 
one point on the curve, namely (1,1). However we can use it to “sneak up” on the answer. 
Let’s approximate the tangent line, by drawing a line that passes through (1,1) and some 
nearby point — call it Q. Here is our recipe: 


• We are given the point P = (1,1) and we are told

Find the tangent line to the curve y = x² that passes through P = (1,1).


• We don’t quite know how to find a line given just 1 point, however we do know how
to find a line passing through 2 points. So pick another point on the curve whose
coordinates are very close to P. Now rather than picking some actual numbers, I am
going to write our second point as Q = (1 + h, (1 + h)²). That is, a point Q whose
x-coordinate is equal to that of P plus a little bit, where the little bit is some small
number h. And since this point lies on the curve y = x², and Q’s x-coordinate is
1 + h, Q’s y-coordinate must be (1 + h)².


If having h as a variable rather than a number bothers you, start by thinking of h
as 0.1.


• A picture of the situation will help.


2 Note that the coordinates (x,y) form an ordered pair of two numbers x and y. Traditionally the first number
is called the abscissa while the second is the ordinate, but these terms are a little archaic. It is now much
more common to hear people refer to the first number as the x-coordinate and the second as the y-coordinate.

Figure 1.1.8. (Sketch labels: “approximation to tangent line”, “tangent line we want”.)


• This line that passes through the curve in two places P and Q is called a “secant
line”.


• The slope of the line is then

\[ m = \frac{y_2-y_1}{x_2-x_1} = \frac{(1+h)^2-1}{(1+h)-1} = \frac{1+2h+h^2-1}{h} = \frac{2h+h^2}{h} = 2+h \]

where we have expanded (1 + h)² = 1 + 2h + h² and then cleaned up a bit.


Now this isn’t our tangent line because it passes through 2 nearby points on the curve;
however it is a reasonable approximation of it. Now we can make that approximation
better and so “sneak up” on the tangent line by considering what happens when we move
this point Q closer and closer to P, i.e. make the number h closer and closer to zero.


Figure 1.1.9. (Sketch labels: “approximation to tangent line”, “tangent line we want”.)


First look at the picture. The original choice of Q is on the left, while on the right we
have drawn what happens if we choose h′ to be some number a little smaller than h, so
that our point Q becomes a new point Q′ that is a little closer to P. The new approximation
is better than the first.

So as we make h smaller and smaller, we bring Q closer and closer to P, and make our
secant line a better and better approximation of the tangent line. We can observe what
happens to the slope of the line as we make h smaller by plugging some numbers into our
formula m = 2 + h:


h=0.1 m= 2.1 
h=0.01 m = 2.01 
h = 0.001 m = 2.001. 


So again we see that as this difference in x becomes smaller and smaller, the slope appears
to be getting closer and closer to 2. We can write this more mathematically as

\[ \lim_{h \to 0} \frac{(1+h)^2-1}{h} = 2 \]

This is read as


The limit, as h approaches 0, of ((1 + h)² − 1)/h is 2.


This is our first limit! Notice that we can see this a little more clearly with a quick bit of
algebra:

\[ \frac{(1+h)^2-1}{h} = \frac{(1+2h+h^2)-1}{h} = \frac{2h+h^2}{h} = 2+h \]

So it is not unreasonable to expect that

\[ \lim_{h \to 0} \frac{(1+h)^2-1}{h} = \lim_{h \to 0} (2+h) = 2. \]

Our tangent line can be thought of as the end of this process — namely as we bring 
Q closer and closer to P, the slope of the secant line comes closer and closer to that of the 
tangent line we want. Since we have worked out what the slope is — that is the limit we 
saw just above — we now know the slope of the tangent line is 2. Given this, we can work 
out the equation for the tangent line. 


• The equation for the line is y = mx + c. We have 2 unknowns m and c, so we need
2 pieces of information to find them.


• Since the line is tangent to the curve at P = (1,1) we know the line must pass through (1,1).
From the limit we computed above, we also know that the line has slope 2.


• Since the slope is 2 we know that m = 2. Thus the equation of the line is y = 2x + c.


• We know that the line passes through (1,1), so that y = 2x + c must be 1 when x = 1.
So 1 = 2 · 1 + c, which forces c = −1.


So our tangent line is y = 2x — 1. 
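
The “sneaking up” process is easy to repeat numerically. The sketch below is only an illustration of the computation in this section; the function names `secant_slope` and `tangent_line` are invented here, and the slope 2 used for the tangent line is the limit found above.

```python
def secant_slope(h):
    """Slope of the line through P = (1, 1) and Q = (1 + h, (1 + h)^2) on y = x^2."""
    return ((1 + h) ** 2 - 1) / ((1 + h) - 1)

for h in (0.1, 0.01, 0.001, 0.0001):
    # The slopes approach 2 as h shrinks (printed values are rounded).
    print(h, round(secant_slope(h), 6))

def tangent_line(x):
    """Line of slope 2 through (1, 1), that is y = 2x - 1."""
    return 2 * (x - 1) + 1

print(tangent_line(1), tangent_line(0))   # 1 -1
```

Note that h = 0 itself cannot be plugged into `secant_slope`, since that would divide by zero; this is exactly why a limit is needed.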
1.2 Another limit and computing velocity


Computing tangent lines is all very well, but what does this have to do with applications 
or the “Real World”? Well - at least initially our use of limits (and indeed of calculus) is 
going to be a little removed from real world applications. However as we go further and 
learn more about limits and derivatives we will be able to get closer to real problems and 
their solutions. 

So stepping just a little closer to the real world, consider the following problem. You 
drop a ball from the top of a very very tall building. Let t be elapsed time measured in 
seconds, and s(t) be the distance the ball has fallen in metres. So s(0) = 0. 

Quick aside: there is quite a bit going on in the statement of this problem. We have 
described the general picture — tall building, ball, falling — but we have also introduced 
notation, variables and units. These will be common first steps in applications and are 
necessary in order to translate a real world problem into mathematics in a clear and con- 
sistent way. 

Galileo³ worked out that s(t) is a quadratic function:

\[ s(t) = 4.9t^2. \]
The question that is posed is 
How fast is the ball falling after 1 second? 


Now before we get to answering this question, we should first be a little more precise. 
The wording of this question is pretty sloppy for a couple of reasons: 


• What do we mean by “after 1 second”? We know the ball will move faster and faster
as time passes, so after 1 second it does not fall at one fixed speed.


• As it stands a reasonable answer to the question would be just “really fast”. If the
person asking the question wants a numerical answer it would be better to ask “At
what speed” or “With what velocity”.


We should also be careful using the words “speed” and “velocity”: they are not interchangeable.


• Speed means the distance travelled per unit time and is always a non-negative number.
An unmoving object has speed 0, while a moving object has positive speed.


• Velocity, on the other hand, also specifies the direction of motion. In this text we will
almost exclusively deal with objects moving along straight lines. Because of this
velocities will be positive or negative numbers indicating which direction the object
is moving along the line. We will be more precise about this later⁴.


3 Perhaps one of the most famous experiments in all of physics is Galileo’s leaning tower of Pisa experiment,
in which he dropped two balls of different masses from the top of the tower and observed that
the time taken to reach the ground was independent of their mass. This disproved Aristotle’s assertion
that heavier objects fall faster. It is quite likely that Galileo did not actually perform this experiment.
Rather it was a thought-experiment. However a quick glance at Wikipedia will turn up some wonderful
footage from the Apollo 15 mission showing a hammer and feather being dropped from equal height
hitting the moon’s surface at the same time. Finally, Galileo determined that the speed of falling objects
increases at a constant rate, which is equivalent to the formula stated here, but it is unlikely that he
wrote down an equation exactly as it is here.

A better question is 
What is the velocity of the ball precisely 1 second after it is dropped? 

or even better: 

What is the velocity of the ball at the 1 second mark? 


This makes it very clear that we want to know what is happening at exactly 1 second after 
the ball is dropped. 

There is something a little subtle going on in this question. In particular, what do we
mean by the velocity at t = 1? Surely if we freeze time at t = 1 second, then the object is
not moving at all? This is definitely not what we mean.

If an object is moving at a constant velocity⁵ in the positive direction, then that velocity
is just the distance travelled divided by the time taken. That is

\[ v = \frac{\text{distance moved}}{\text{time taken}} \]

An object moving at constant velocity that moves 27 metres in 3 seconds has velocity

\[ v = \frac{27\,\mathrm{m}}{3\,\mathrm{s}} = 9\,\mathrm{m/s}. \]

When velocity is constant everything is easy.

However, in our falling object example, the object is being acted on by gravity and its
speed is definitely not constant. Instead of asking for THE velocity, let us examine the
“average velocity” of the object over a certain window of time. In this case the formula is
very similar

\[ \text{average velocity} = \frac{\text{distance moved}}{\text{time taken}} \]

But now I want to be more precise; instead write

\[ \text{average velocity} = \frac{\text{difference in distance}}{\text{difference in time}} \]

Now in spoken English we haven’t really changed much (the distance moved is the
difference in position, and the time taken is just the difference in time) but the latter is
more mathematically precise, and is easy to translate into the following equation

\[ \text{average velocity} = \frac{s(t_2)-s(t_1)}{t_2-t_1} \]

This is the formula for the average velocity of our object between time t₁ and t₂. The
denominator is just the difference between these times and the numerator is the difference
in position, i.e. position at time t₁ is just s(t₁) and position at time t₂ is just s(t₂).

4 Getting the sign of velocity wrong is a very common error; you should be careful with it.

5 Newton’s first law of motion states that an object in motion moves with constant velocity unless a force
acts on it, for example gravity or friction.

So what is the average velocity of the falling ball between 1 and 1.1 seconds? All we 
need to do now is plug some numbers into our formula 


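\begin{align*}
\text{average velocity} &= \frac{\text{difference in position}}{\text{difference in time}} \\
&= \frac{s(1.1)-s(1)}{1.1-1} \\
&= \frac{4.9(1.1)^2-4.9(1)}{0.1} = \frac{4.9 \times 0.21}{0.1} = 10.29\,\mathrm{m/s}
\end{align*}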


And we have our average velocity. However there is something we should notice about
this formula and it is easier to see if we sketch a graph of the function s(t)


Figure 1.2.1.


So on the left I have drawn the graph and noted the times t = 1 and t = 1.1, together with the
corresponding positions on the axes and the two points on the curve. On the right I have
added a few more details. In particular I have noted the differences in position and time,
and the line joining the two points. Notice that the slope of this line is

\[ \text{slope} = \frac{\text{change in } y}{\text{change in } x} = \frac{\text{difference in } s}{\text{difference in } t} \]

which is precisely our expression for the average velocity.
Let us examine what happens to the average velocity as we look over smaller and
smaller time-windows.
Let us examine what happens to the average velocity as we look over smaller and 
smaller time-windows. 


time window        average velocity
1 ≤ t ≤ 1.1        10.29
1 ≤ t ≤ 1.01       9.849
1 ≤ t ≤ 1.001      9.8049
1 ≤ t ≤ 1.0001     9.80049

As we make the time interval smaller and smaller we find that the average velocity is
getting closer and closer to 9.8. We can be a little more precise by finding the average
velocity between t = 1 and t = 1 + h; this is very similar to what we did for tangent
lines.

\begin{align*}
\text{average velocity} &= \frac{s(1+h)-s(1)}{(1+h)-1} \\
&= \frac{4.9(1+h)^2-4.9}{h} \\
&= \frac{9.8h+4.9h^2}{h} \\
&= 9.8+4.9h
\end{align*}


Now as we squeeze this window between t = 1 and t = 1 + h down towards zero, the
average velocity becomes the “instantaneous velocity”, just as the slope of the secant
line becomes the slope of the tangent line. This is our second limit

\[ v(1) = \lim_{h \to 0} \frac{s(1+h)-s(1)}{h} = 9.8 \]

More generally we define the instantaneous velocity at time t = a to be the limit

\[ v(a) = \lim_{h \to 0} \frac{s(a+h)-s(a)}{h} \]

We read this as


The velocity at time a is equal to the limit as h goes to zero of (s(a + h) − s(a))/h.

While we have solved the problem stated at the start of this section, it is clear that if we
wish to solve similar problems we will need to understand limits in a more general
and systematic way.
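
The table of average velocities in this section can be reproduced with a few lines of code. This is a small sketch using the formula s(t) = 4.9t² from this section; the function names `s` and `average_velocity` are simply convenient choices for the illustration.

```python
def s(t):
    """Distance fallen (in metres) after t seconds."""
    return 4.9 * t**2

def average_velocity(t1, t2):
    """Difference in position divided by difference in time."""
    return (s(t2) - s(t1)) / (t2 - t1)

for h in (0.1, 0.01, 0.001, 0.0001):
    # Average velocity over the window 1 <= t <= 1 + h, rounded for display.
    print(h, round(average_velocity(1, 1 + h), 6))
```

The printed values shrink towards 9.8 as the window shrinks, in agreement with the expression 9.8 + 4.9h derived above.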


1.3 The limit of a function


Before we come to definitions, let us start with a little notation for limits. 


Notation 1.3.1.


We will often write

\[ \lim_{x \to a} f(x) = L \]

which should be read as


The limit of f(x) as x approaches a is L.

The notation is just shorthand: we don’t want to have to write out long sentences
as we do our mathematics. Whenever you see these symbols you should think of that
sentence.

This shorthand also has the benefit of being mathematically precise (we'll see this 
later), and (almost) independent of the language in which the author is writing. A mathe- 
matician who does not speak English can read the above formula and understand exactly 
what it means. 

In mathematics, like most languages, there is usually more than one way of writing 
things and we can also write the above limit as 


\[ f(x) \to L \text{ as } x \to a \]

This can also be read as above, but also as

f(x) goes to L as x goes to a


They mean exactly the same thing in mathematics, even though they might be written,
read and said a little differently.
To arrive at the definition of limit, I want to start⁶ with a very simple example.


Example 1.3.2 


Consider the following function.

\[ f(x) = \begin{cases} 2x & x < 3 \\ 9 & x = 3 \\ 2x & x > 3 \end{cases} \]

This is an example of a piece-wise function⁷. That is, a function defined in several pieces,
rather than as a single formula. We evaluate the function at a particular value of x on a
case-by-case basis. Here is a sketch of it


Notice the two circles in the plot. One is open, ∘, and the other is closed, •.


• A filled circle has quite a precise meaning: a filled circle at (x,y) means that the
function takes the value f(x) = y.


• An open circle is a little harder: an open circle at (3,6) means that the point (3,6)
is not on the graph of y = f(x), i.e. f(3) ≠ 6. We should only use the open circle
where it is absolutely necessary in order to avoid confusion.


6 Well, we had two limits in the previous sections, so perhaps we really want to “restart” with a very
simple example.

7 We saw another piecewise function back in Example 0.4.5.


This function is quite contrived, but it is a very good example to start working with
limits more systematically. Consider what the function does close to x = 3. We already
know what happens exactly at 3, namely f(3) = 9, but I want to look at how the function
behaves very close to x = 3. That is, what does the function do as we look at a point x
that gets closer and closer to x = 3.

If we plug in some numbers very close to 3 (but not exactly 3) into the function we see 
the following: 


x    | 2.9 | 2.99 | 2.999 | ∘ | 3.001 | 3.01 | 3.1
f(x) | 5.8 | 5.98 | 5.998 |   | 6.002 | 6.02 | 6.2


So as x moves closer and closer to 3, without being exactly 3, we see that the function 
moves closer and closer to 6. We can write this as 


\[
\lim_{x \to 3} f(x) = 6
\]

That is

    The limit as x approaches 3 of f(x) is 6.

So for x very close to 3, without being exactly 3, the function is very close to 6 — which
is a long way from the value of the function exactly at 3, f(3) = 9. Note well that the
behaviour of the function as x gets very close to 3 does not depend on the value of the
function at 3.

End of Example 1.3.2
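The table in Example 1.3.2 can be reproduced with a few lines of Python. This is only a sketch of the "plug in numbers" idea; the function name `f` mirrors the text and the sample points are the ones in the table.

```python
def f(x):
    """Piecewise function of Example 1.3.2: 2x everywhere except f(3) = 9."""
    if x == 3:
        return 9.0
    return 2.0 * x

# Sample points approaching 3 from both sides (never exactly 3).
for x in [2.9, 2.99, 2.999, 3.001, 3.01, 3.1]:
    print(f"x = {x:<6}  f(x) = {f(x):.3f}")

# The value exactly at 3 plays no role in the behaviour seen above.
print("f(3) =", f(3))
```

The printed values cluster around 6 while `f(3)` is 9, which is the numerical picture behind the statement that the limit does not depend on the value at the point.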


We now have enough to make an informal definition of a limit, which is actually sufficient
for most of what we will do in this text.

Definition 1.3.3.

We write

\[
\lim_{x \to a} f(x) = L
\]

when the value of the function f(x) gets closer and closer to L as x gets closer
and closer to a (without⁸ being exactly a).


8 You may find the condition “without being exactly a” a little strange, but there is a good reason for it.
  One very important application of limits, indeed the main reason we teach the topic, is in the definition
  of derivatives (see Definition 2.2.1 in the next chapter). In that definition we need to compute the limit
  $\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$. In this case the function whose limit is being taken, namely
  $\frac{f(x) - f(a)}{x - a}$, is not defined at all at x = a.

In order to make this definition more mathematically correct, we need to make the idea
of “closer and closer” more precise — we do this in Section 1.7. It should be emphasised
that the formal definition and the contents of that section are optional material.

For now, let us use the above definition to examine a more substantial example. 


Example 1.3.4 


Let f(x) = (x − 2)/(x² + x − 6) and consider its limit as x → 2.

• We are really being asked

\[
\lim_{x \to 2} \frac{x - 2}{x^2 + x - 6}
\]

• Now if we try to compute f(2) we get 0/0 which is undefined. The function is not
  defined at that point — this is a good example of why we need limits. We have to
  sneak up on these places where a function is not defined (or is badly behaved).

• VERY IMPORTANT POINT: the fraction 0/0 is not ∞ and it is not 1, it is not defined.
  We cannot ever divide by zero in normal arithmetic and obtain a consistent
  and mathematically sensible answer. If you learned otherwise in high-school, you
  should quickly unlearn it.

• Again, we can plug in some numbers close to 2 and see what we find

x    | 1.9     | 1.99    | 1.999   | ○ | 2.001   | 2.01    | 2.1
f(x) | 0.20408 | 0.20040 | 0.20004 | ○ | 0.19996 | 0.19960 | 0.19608

• So it is reasonable to suppose that

\[
\lim_{x \to 2} \frac{x - 2}{x^2 + x - 6} = 0.2
\]

End of Example 1.3.4
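As a quick numerical companion to Example 1.3.4, the sketch below evaluates the same rational function at the tabulated points and confirms that evaluating exactly at x = 2 fails. The name `f` follows the text; everything else is illustrative.

```python
def f(x):
    """f(x) = (x - 2) / (x^2 + x - 6); the formula gives 0/0 at x = 2."""
    return (x - 2) / (x**2 + x - 6)

# Values on either side of 2, matching the table in the example.
for x in [1.9, 1.99, 1.999, 2.001, 2.01, 2.1]:
    print(f"x = {x:<6}  f(x) = {f(x):.5f}")

# Exactly at x = 2 Python refuses to divide zero by zero.
try:
    f(2)
except ZeroDivisionError:
    print("f(2) is undefined (0/0)")
```

The outputs drift toward 0.2 from both sides, which supports (but of course does not prove) the guess made in the example.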


The previous two examples are nicely behaved in that the limits we tried to compute
actually exist. We now turn to two nastier examples⁹ in which the limits we are interested
in do not exist.


Example 1.3.5 (A bad example) 


Consider the following function f(x) = sin(π/x). Find the limit as x → 0 of f(x).

We should see something interesting happening close to x = 0 because f(x) is undefined
there. Using your favourite graph-plotting software you can see that the graph
looks roughly like

9 Actually, they are good examples, but the functions in them are nastier.

How to explain this? As x gets closer and closer to zero, π/x becomes larger and larger
(remember what the plot of y = 1/x looks like). So when you take sine of that number, it
oscillates faster and faster the closer you get to zero. Since the function does not approach
a single number as we bring x closer and closer to zero, the limit does not exist.
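The oscillation can be made concrete by choosing inputs on purpose. The sketch below uses two hypothetical families of sample points, x = 2/(4k+1) and x = 2/(4k+3), picked so that π/x lands on the peaks and troughs of sine; both families shrink to 0 as k grows.

```python
import math

def f(x):
    return math.sin(math.pi / x)

# At x = 2/(4k+1), pi/x = 2*pi*k + pi/2, so the sine is 1.
# At x = 2/(4k+3), pi/x = 2*pi*k + 3*pi/2, so the sine is -1.
for k in [1, 10, 100, 1000]:
    x_peak = 2 / (4 * k + 1)
    x_trough = 2 / (4 * k + 3)
    print(f"x = {x_peak:.6f} -> {f(x_peak):+.6f}    "
          f"x = {x_trough:.6f} -> {f(x_trough):+.6f}")
```

However small x becomes, the function keeps returning to both +1 and −1, so there is no single number it settles on.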

We write this as 


\[
\lim_{x \to 0} \sin\left(\frac{\pi}{x}\right) \text{ does not exist}
\]

It's not very inventive notation, however it is clear. We frequently abbreviate “does not
exist” to “DNE” and rewrite the above as

\[
\lim_{x \to 0} \sin\left(\frac{\pi}{x}\right) = \text{DNE}
\]

End of Example 1.3.5


In the following example, the limit we are interested in does not exist. However the 
way in which things go wrong is quite different from what we just saw. 


Example 1.3.6

Consider the function

\[
f(x) = \begin{cases} x & x < 2 \\ -1 & x = 2 \\ x + 3 & x > 2 \end{cases}
\]

• The plot of this function looks like this

• So let us plug in numbers close to 2.

x    | 1.9 | 1.99 | 1.999 | ○ | 2.001 | 2.01 | 2.1
f(x) | 1.9 | 1.99 | 1.999 |   | 5.001 | 5.01 | 5.1

• This isn’t like before. Now when we approach from below, we seem to be getting
  closer to 2, but when we approach from above we seem to be getting closer to 5.
  Since we are not approaching the same number the limit does not exist.

\[
\lim_{x \to 2} f(x) = \text{DNE}
\]

End of Example 1.3.6


While the limit in the previous example does not exist, the example serves to introduce 
the idea of “one-sided limits”. For example, we can say that 


As x moves closer and closer to two from below the function approaches 2. 
and similarly 


As x moves closer and closer to two from above the function approaches 5. 


Definition 1.3.7.

We write

\[
\lim_{x \to a^-} f(x) = K
\]

when the value of f(x) gets closer and closer to K when x < a and x moves
closer and closer to a. Since the x-values are always less than a, we say that x
approaches a from below. This is also often called the left-hand limit since the
x-values lie to the left of a on a sketch of the graph.
We similarly write

\[
\lim_{x \to a^+} f(x) = L
\]

when the value of f(x) gets closer and closer to L when x > a and x moves closer
and closer to a. For similar reasons we say that x approaches a from above, and
sometimes refer to this as the right-hand limit.


Note — be careful to include the superscript + and — when writing these limits. You 
might also see the following notations: 


\[
\lim_{x \to a^+} f(x) = \lim_{x \to a+} f(x) = \lim_{x \downarrow a} f(x) = \lim_{x \searrow a} f(x) = L \qquad \text{right-hand limit}
\]
\[
\lim_{x \to a^-} f(x) = \lim_{x \to a-} f(x) = \lim_{x \uparrow a} f(x) = \lim_{x \nearrow a} f(x) = L \qquad \text{left-hand limit}
\]

but please use the notation in Definition 1.3.7 above.


Given these two similar notions of limits, when are they the same? The following
theorem tells us.

Theorem 1.3.8 (Limits and one sided limits).

\[
\lim_{x \to a} f(x) = L \quad \text{if and only if} \quad \lim_{x \to a^-} f(x) = L \ \text{ and } \ \lim_{x \to a^+} f(x) = L
\]


Notice that this is really two separate statements because of the “if and only if”:

• If the limit of f(x) as x approaches a exists and is equal to L, then both the left-hand
  and right-hand limits exist and are equal to L. AND,

• If the left-hand and right-hand limits as x approaches a exist and are equal, then the
  limit as x approaches a exists and is equal to the one-sided limits.

That is — the limit of f(x) as x approaches a will only exist if it doesn’t matter which way
we approach a (either from left or right) AND if we get the same one-sided limits when
we approach from left and right, then the limit exists.

We can rephrase the above by writing the contrapositive¹⁰ of the statements

• If either of the left-hand and right-hand limits as x approaches a fails to exist, or if
  they both exist but are different, then the limit as x approaches a does not exist.
  AND,

• If the limit as x approaches a does not exist, then the left-hand and right-hand limits
  are either different or at least one of them does not exist.
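The one-sided behaviour of the function in Example 1.3.6 can be probed numerically. The helper `one_sided_samples` below is a hypothetical name introduced only for this sketch; it evaluates a function at points a ± 10⁻ᵏ, which gives numerical evidence about one-sided limits but is not a proof.

```python
def f(x):
    """Piecewise function of Example 1.3.6."""
    if x < 2:
        return x
    if x == 2:
        return -1
    return x + 3

def one_sided_samples(func, a, side, steps=5):
    """Values of func at a - 10**-k (side '-') or a + 10**-k (side '+'), k = 1..steps."""
    sign = -1 if side == "-" else 1
    return [func(a + sign * 10.0**-k) for k in range(1, steps + 1)]

left = one_sided_samples(f, 2, "-")
right = one_sided_samples(f, 2, "+")
print("from below:", left)
print("from above:", right)
# The two lists settle near different numbers (2 and 5), which is the
# numerical symptom of a two-sided limit that does not exist.
```

When the two lists settle near the same number, Theorem 1.3.8 suggests the two-sided limit is that number; here they do not, matching the DNE conclusion of the example.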


Here is another limit example 


Example 1.3.9 


Consider the following two functions and compute their limits and one-sided limits as x 
approaches 1: 


10 Given a statement of the form “If A then B” the contrapositive is “If not B then not A”. They are logically
   equivalent — if one is true then so is the other. We must take care not to confuse the contrapositive with
   the converse. Given “If A then B” the converse is “If B then A”. These are definitely not the same.
   To see this consider the statement “If he is Shakespeare then he is dead.” The converse is “If he is
   dead then he is Shakespeare” — clearly garbage since there are plenty of dead people who are not
   Shakespeare. The contrapositive is “If he is not dead then he is not Shakespeare” — which makes much
   more sense.

These are a little different from our previous examples, in that we do not have formulas,
only the sketch. But we can still compute the limits.

• Function on the left — f(x):

\[
\lim_{x \to 1^-} f(x) = 2 \qquad \lim_{x \to 1^+} f(x) = 2
\]

  so by the previous theorem

\[
\lim_{x \to 1} f(x) = 2
\]

• Function on the right — g(t):

\[
\lim_{t \to 1^-} g(t) = 2 \quad \text{and} \quad \lim_{t \to 1^+} g(t) = -2
\]

  so by the previous theorem

\[
\lim_{t \to 1} g(t) = \text{DNE}
\]

End of Example 1.3.9


We have seen 2 ways in which a limit does not exist — in one case the function oscil- 
lated wildly, and in the other there was some sort of “jump” in the function, so that the 
left-hand and right-hand limits were different. 

There is a third way that we must also consider. To describe this, consider the following 
four functions: 


Figure 1.3.1. Four sketches, labelled function 1, function 2, function 3 and function 4.

None of these functions are defined at x = a, nor do the limits as x approaches a
exist. However we can say more than just “the limits do not exist”.

Notice that the value of function 1 can be made bigger and bigger as we bring x closer
and closer to a. Similarly the value of the second function can be made arbitrarily large
and negative (i.e. make it as big a negative number as we want) by bringing x closer and
closer to a. Based on this observation we have the following definition.


Definition 1.3.10.

We write

\[
\lim_{x \to a} f(x) = +\infty
\]

when the value of the function f(x) becomes arbitrarily large and positive as x
gets closer and closer to a, without being exactly a.
Similarly, we write

\[
\lim_{x \to a} f(x) = -\infty
\]

when the value of the function f(x) becomes arbitrarily large and negative as x
gets closer and closer to a, without being exactly a.


A good example of the above is

\[
\lim_{x \to 0} \frac{1}{x^2} = +\infty \qquad \lim_{x \to 0} -\frac{1}{x^2} = -\infty
\]


IMPORTANT POINT: Please do not think of “+∞” and “−∞” in these statements as
numbers. The statement

\[
\lim_{x \to a} f(x) = +\infty
\]

does not say “the limit of f(x) as x approaches a is positive infinity”. It says “the function
f(x) becomes arbitrarily large as x approaches a”. These are different statements.
Remember that ∞ is not a number¹¹.

Now consider functions 3 and 4 in Figure 1.3.1. Here we can make the value of the 
function as big and positive as we want (for function 3) or as big and negative as we want 
(for function 4) but only when x approaches a from one side. With this in mind we can 
construct similar notation and a similar definition: 


11 One needs to be very careful making statements about infinity. At some point in our lives we get around
   to asking ourselves “what is the biggest number”, and we realise there isn’t one. That is, we can go on
   counting integer after integer, for ever and not stop. Indeed the set of integers is the first infinite thing
   we really encounter. It is an example of a countably infinite set. The set of real numbers is actually much
   bigger and is uncountably infinite. In fact there are an infinite number of different sorts of infinity! Much
   of the theory of infinite sets was developed by Georg Cantor; we mentioned him back in Section 0.2
   and he is well worth googling.

Definition 1.3.11.

We write

\[
\lim_{x \to a^+} f(x) = +\infty
\]

when the value of the function f(x) becomes arbitrarily large and positive as x
gets closer and closer to a from above (equivalently — from the right), without
being exactly a.
Similarly, we write

\[
\lim_{x \to a^+} f(x) = -\infty
\]

when the value of the function f(x) becomes arbitrarily large and negative as x
gets closer and closer to a from above (equivalently — from the right), without
being exactly a.

The notation

\[
\lim_{x \to a^-} f(x) = +\infty \qquad \lim_{x \to a^-} f(x) = -\infty
\]

has a similar meaning except that limits are approached from below / from the
left.


So for function 3 we have

\[
\lim_{x \to a^-} f(x) = +\infty \qquad \lim_{x \to a^+} f(x) = \text{some positive number}
\]

and for function 4

\[
\lim_{x \to a^-} f(x) = \text{some positive number} \qquad \lim_{x \to a^+} f(x) = -\infty
\]


More examples: 


Example 1.3.12

Consider the function

\[
g(x) = \frac{1}{\sin(x)}
\]

Find the one-sided limits of this function as x → π.

Probably the easiest way to do this is to first plot the graph of sin(x) and 1/x and then
think carefully about the one-sided limits:

• As x → π from the left, sin(x) is a small positive number that is getting closer and
  closer to zero. That is, as x → π⁻, we have that sin(x) → 0 through positive numbers
  (i.e. from above). Now look at the graph of 1/x, and think what happens as we move
  x → 0⁺: the function is positive and becomes larger and larger.

  So as x → π from the left, sin(x) → 0 from above, and so 1/sin(x) → +∞.

• By very similar reasoning, as x → π from the right, sin(x) is a small negative number
  that gets closer and closer to zero. So as x → π from the right, sin(x) → 0 through
  negative numbers (i.e. from below) and so 1/sin(x) → −∞.

Thus

\[
\lim_{x \to \pi^-} \frac{1}{\sin(x)} = +\infty \qquad \lim_{x \to \pi^+} \frac{1}{\sin(x)} = -\infty
\]

End of Example 1.3.12
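A small numerical check of Example 1.3.12 is sketched below: it evaluates 1/sin(x) at points just to the left and just to the right of π. The step sizes are arbitrary choices made for illustration.

```python
import math

def g(x):
    return 1 / math.sin(x)

# Step toward pi from each side with shrinking offsets d.
for k in range(1, 6):
    d = 10.0**-k
    left = g(math.pi - d)   # sin is small and positive here
    right = g(math.pi + d)  # sin is small and negative here
    print(f"d = {d:g}:  g(pi - d) = {left:+.3e}   g(pi + d) = {right:+.3e}")
```

The left-hand values grow large and positive while the right-hand values grow large and negative, in line with the two one-sided statements in the example.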


Again, we can make Definitions 1.3.10 and 1.3.11 into mathematically precise formal
definitions using techniques very similar to those in the optional Section 1.7. This is not
strictly necessary for this course.

Up to this point we explored limits by sketching graphs or plugging values into a calculator.
This was done to help build intuition, but it is not really the basis of a systematic
method for computing limits. We have also avoided more formal approaches¹² since we
do not have time in the course to go into that level of detail and (arguably) we don’t need
that detail to achieve the aims of the course. Thankfully we can develop a more systematic
approach based on the idea of building up complicated limits from simpler ones by
examining how limits interact with the basic operations of arithmetic.


1.4 Calculating limits with limit laws


Think back to the functions you know and the sorts of things you have been asked to
draw, factor and so on. They are all constructed from simple pieces, such as

• constants — c

• monomials — x^n

• trigonometric functions — sin(x), cos(x) and tan(x)

These are the building blocks from which we construct functions. Soon we will add a
few more functions to this list, especially the exponential function and various inverse
functions.

12 The formal approaches are typically referred to as “epsilon-delta limits” or “epsilon-delta proofs” since
   the symbols ε and δ are traditionally used throughout. Take a peek at Section 1.7 to see.

We then take these building blocks and piece them together using arithmetic

• addition and subtraction — f(x) = g(x) + h(x) and f(x) = g(x) − h(x)

• multiplication — f(x) = g(x) · h(x)

• division — f(x) = g(x)/h(x)

• substitution — f(x) = g(h(x)) — this is also called the composition of g with h.

The idea of building up complicated functions from simpler pieces was discussed in Section 0.5.

What we will learn in this section is how to compute the limits of the basic building
blocks and then how we can compute limits of sums, products and so forth using “limit
laws”. This process allows us to compute limits of complicated functions, using very
simple tools and without having to resort to “plugging in numbers” or “closer and closer”
or “ε–δ arguments”.

In the examples we saw above, almost all the interesting limits happened at points 
where the underlying function was badly behaved — where it jumped, was not defined 
or blew up to infinity. In those cases we had to be careful and think about what was 
happening. Thankfully most functions we will see do not have too many points at which 
these sorts of things happen. 

For example, polynomials do not have any nasty jumps and are defined everywhere
and do not “blow up”. If you plot them, they look smooth¹³. Polynomials and limits
behave very nicely together, and for any polynomial P(x) and any real number a we have
that

\[
\lim_{x \to a} P(x) = P(a)
\]

That is — to evaluate the limit we just plug in the number. We will build up to this result
over the next few pages.
Let us start with the two easiest limits¹⁴

Theorem 1.4.1 (Easiest limits).

Let a, c ∈ ℝ. The following two limits hold

\[
\lim_{x \to a} c = c \quad \text{and} \quad \lim_{x \to a} x = a
\]

13 We have used this term in an imprecise way, but it does have a precise mathematical meaning.
14 Though it lies outside the scope of the course, you can find the formal ε-δ proof of this result at the end
   of Section 1.7.




Since we have not seen too many theorems yet, let us examine it carefully piece by 
piece. 


• Let a, c ∈ ℝ — just as was the case for definitions, we start a theorem by defining
  terms and setting the scene. There is not too much scene to set: the symbols a and c
  are real numbers.

• The following two limits hold — this doesn’t really contribute much to the statement
  of the theorem, it just makes it easier to read.

• $\lim_{x \to a} c = c$ — when we take the limit of a constant function (for example think of
  c = 3), the limit is (unsurprisingly) just that same constant.

• $\lim_{x \to a} x = a$ — as we noted above for general polynomials, the limit of the function
  f(x) = x as x approaches a given point a, is just a. This says something quite obvious
  — as x approaches a, x approaches a (if you are not convinced then sketch the graph).


Armed with only these two limits, we cannot do very much. But combining these
limits with some arithmetic we can do quite a lot. For a moment, take a step back from
limits and think about how we construct functions. To make the discussion
a little more precise think about how we might construct the function

\[
h(x) = \frac{2x - 3}{x^2 + 5x - 6}
\]

If we want to compute the value of the function at x = 2, then we would

• compute the numerator at x = 2
• compute the denominator at x = 2
• compute the ratio

Now to compute the numerator we

• take x and multiply it by 2
• subtract 3 from the result

While for the denominator

• multiply x by x
• multiply x by 5
• add these two numbers and subtract 6

This sequence of operations can be represented pictorially as the tree shown in Figure 1.4.1
below.

Figure 1.4.1.


Such trees were discussed in Section 0.5 (now is not a bad time to quickly review that 
section before proceeding). The point here is that in order to compute the value of the 
function we just repeatedly add, subtract, multiply and divide constants and x. 
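The step-by-step recipe for evaluating h(x) translates directly into code. The sketch below follows the listed steps one line at a time; the intermediate variable names are made up for readability and are not part of the text.

```python
def h(x):
    """Evaluate h(x) = (2x - 3) / (x^2 + 5x - 6) one arithmetic step at a time."""
    numerator = 2 * x                     # take x and multiply it by 2
    numerator = numerator - 3             # subtract 3 from the result
    x_squared = x * x                     # multiply x by x
    five_x = 5 * x                        # multiply x by 5
    denominator = x_squared + five_x - 6  # add these two numbers and subtract 6
    return numerator / denominator        # compute the ratio

print(h(2))  # 0.125, i.e. 1/8
```

Each line uses only addition, subtraction, multiplication or division of constants and x, which is exactly the structure the limit laws exploit.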


To compute the limit of the above function at x = 2 we can do something very similar.
From the previous theorem we know how to compute

\[
\lim_{x \to 2} c = c \quad \text{and} \quad \lim_{x \to 2} x = 2
\]

and the next theorem will tell us how to stitch together these two limits using the arithmetic
we used to construct the function.

Theorem 1.4.2 (Arithmetic of limits).

Let a, c ∈ ℝ, let f(x) and g(x) be defined for all x’s that lie in some interval about
a (but f, g need not be defined exactly at a). Let

\[
\lim_{x \to a} f(x) = F \qquad \lim_{x \to a} g(x) = G
\]

exist with F, G ∈ ℝ. Then the following limits hold

• $\lim_{x \to a} (f(x) + g(x)) = F + G$ — limit of the sum is the sum of the limits.

• $\lim_{x \to a} (f(x) - g(x)) = F - G$ — limit of the difference is the difference of the
  limits.

• $\lim_{x \to a} c f(x) = cF$.

• $\lim_{x \to a} (f(x) \cdot g(x)) = F \cdot G$ — limit of the product is the product of the limits.

• If G ≠ 0 then $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{F}{G}$.

Note — be careful with this last one — the denominator cannot be zero.


The above theorem shows that limits interact very simply with arithmetic. If you are 
asked to find the limit of a sum then the answer is just the sum of the limits. Similarly the 
limit of a product is just the product of the limits. 

How do we apply the above theorem to the rational function h(x) we defined above? 
Here is a warm-up example: 


Example 1.4.3 


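You are given two functions f, g (not explicitly) which have the following limits as x approaches 1:

\[
\lim_{x \to 1} f(x) = 3 \quad \text{and} \quad \lim_{x \to 1} g(x) = 2
\]

Using the above theorem we can compute

\[
\begin{aligned}
\lim_{x \to 1} 3 f(x) &= 3 \times 3 = 9 \\
\lim_{x \to 1} 3 f(x) - g(x) &= 3 \times 3 - 2 = 7 \\
\lim_{x \to 1} f(x) g(x) &= 3 \times 2 = 6 \\
\lim_{x \to 1} \frac{f(x)}{f(x) - g(x)} &= \frac{3}{3 - 2} = 3
\end{aligned}
\]

End of Example 1.4.3

Another simple example

Example 1.4.4

Find $\lim_{x \to 3} 4x^2 - 1$.

We use the arithmetic of limits:

\[
\begin{aligned}
\lim_{x \to 3} 4x^2 - 1 &= \left(\lim_{x \to 3} 4x^2\right) - \lim_{x \to 3} 1 && \text{difference of limits} \\
&= \left(\lim_{x \to 3} 4 \cdot \lim_{x \to 3} x^2\right) - \lim_{x \to 3} 1 && \text{product of limits} \\
&= 4 \cdot \left(\lim_{x \to 3} x^2\right) - 1 && \text{limit of constant} \\
&= 4 \cdot \left(\lim_{x \to 3} x\right) \cdot \left(\lim_{x \to 3} x\right) - 1 && \text{product of limits} \\
&= 4 \cdot 3 \cdot 3 - 1 && \text{limit of } x \\
&= 36 - 1 \\
&= 35
\end{aligned}
\]

End of Example 1.4.4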
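The two examples above can be spot-checked numerically. In the sketch below, `f` and `g` are hypothetical stand-ins: the text never gives formulas for them, so they are chosen here only so that f → 3 and g → 2 as x → 1. The last lines check Example 1.4.4 at a point close to 3.

```python
# Hypothetical functions, chosen only so that f -> 3 and g -> 2 as x -> 1.
def f(x):
    return x + 2

def g(x):
    return 2 * x

x = 1 + 1e-8  # close to 1, but not equal to 1
example_143 = [
    3 * f(x),              # limit law predicts 9
    3 * f(x) - g(x),       # predicts 7
    f(x) * g(x),           # predicts 6
    f(x) / (f(x) - g(x)),  # predicts 3 (denominator tends to 1, not 0)
]
print(example_143)

# Example 1.4.4: 4x^2 - 1 close to x = 3 should be near 4*3*3 - 1 = 35.
example_144 = 4 * (3 + 1e-8) ** 2 - 1
print(example_144)
```

Agreement at one nearby point is only a sanity check, but it is a cheap way to catch arithmetic slips when applying the limit laws by hand.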


This is an excruciating level of detail, but when you first use this theorem and try some 
examples it is a good idea to do things step by step by step until you are comfortable with 
it. 


Example 1.4.5

Yet another limit — compute $\lim_{x \to 2} \frac{x}{x - 1}$.

To apply the arithmetic of limits, we need to examine numerator and denominator
separately and make sure the limit of the denominator is non-zero. Numerator first:

\[
\lim_{x \to 2} x = 2 \qquad \text{limit of } x
\]

and now the denominator:

\[
\begin{aligned}
\lim_{x \to 2} x - 1 &= \left(\lim_{x \to 2} x\right) - \left(\lim_{x \to 2} 1\right) && \text{difference of limits} \\
&= 2 - 1 && \text{limit of } x \text{ and limit of constant} \\
&= 1
\end{aligned}
\]

Since the limit of the denominator is non-zero we can put it back together to get

\[
\lim_{x \to 2} \frac{x}{x - 1} = \frac{\lim_{x \to 2} x}{\lim_{x \to 2} (x - 1)} = \frac{2}{1} = 2
\]

End of Example 1.4.5


In the next example we show that many different things can happen if the limit of the 
denominator is zero. 


Example 1.4.6 (Be careful with limits of ratios) 


We must be careful when computing the limit of a ratio — it is the ratio of the limits except
when the limit of the denominator is zero. When the limit of the denominator is zero
Theorem 1.4.2 does not apply and a few interesting things can happen

• If the limit of the numerator is non-zero then the limit of the ratio does not exist

\[
\lim_{x \to a} \frac{f(x)}{g(x)} = \text{DNE} \quad \text{when } \lim_{x \to a} f(x) \neq 0 \text{ and } \lim_{x \to a} g(x) = 0
\]

  For example, $\lim_{x \to 0} \frac{1}{x^2} = \text{DNE}$.

• If the limit of the numerator is zero then the above theorem does not give us enough
  information to decide whether or not the limit exists. It is possible that

  – the limit does not exist, eg. $\lim_{x \to 0} \frac{x}{x^2} = \lim_{x \to 0} \frac{1}{x} = \text{DNE}$

  – the limit is ±∞, eg. $\lim_{x \to 0} \frac{x^2}{x^4} = \lim_{x \to 0} \frac{1}{x^2} = +\infty$ or $\lim_{x \to 0} \frac{-x^2}{x^4} = \lim_{x \to 0} \frac{-1}{x^2} = -\infty$

  – the limit is zero, eg. $\lim_{x \to 0} \frac{x^2}{x} = 0$

  – the limit exists and is non-zero, eg. $\lim_{x \to 0} \frac{x}{x} = 1$

Now while the above examples are very simple and a little contrived they serve to illustrate
the point we are trying to make — be careful if the limit of the denominator is zero.

End of Example 1.4.6
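The different outcomes listed in Example 1.4.6 show up clearly when the ratios are evaluated near zero. The following sketch tabulates each ratio at a few small positive and negative inputs; the dictionary `cases` and its labels are illustrative names only.

```python
# Each ratio has numerator and denominator both tending to 0 as x -> 0.
cases = {
    "x / x**2": lambda x: x / x**2,          # behaves like 1/x
    "x**2 / x**4": lambda x: x**2 / x**4,    # behaves like 1/x^2
    "-x**2 / x**4": lambda x: -x**2 / x**4,  # behaves like -1/x^2
    "x**2 / x": lambda x: x**2 / x,          # behaves like x
    "x / x": lambda x: x / x,                # equals 1 away from 0
}

for name, ratio in cases.items():
    values = [ratio(x) for x in (-1e-3, -1e-6, 1e-6, 1e-3)]
    print(f"{name:13s}", ["%.3g" % v for v in values])
```

The first row changes sign across zero, the next two blow up with a fixed sign, and the last two settle at 0 and 1, so "0 over 0" on its own tells us nothing about the answer.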


We now have enough theory to return to our rational function and compute its limit
as x approaches 2.

Example 1.4.7

Let $h(x) = \frac{2x - 3}{x^2 + 5x - 6}$ and find its limit as x approaches 2.

Since this is the limit of a ratio, we compute the limit of the numerator and denominator
separately. Numerator first:

\[
\begin{aligned}
\lim_{x \to 2} 2x - 3 &= \left(\lim_{x \to 2} 2x\right) - \left(\lim_{x \to 2} 3\right) && \text{difference of limits} \\
&= 2 \cdot \left(\lim_{x \to 2} x\right) - 3 && \text{product of limits and limit of constant} \\
&= 2 \cdot 2 - 3 && \text{limits of } x \\
&= 1
\end{aligned}
\]

Denominator next:

\[
\begin{aligned}
\lim_{x \to 2} x^2 + 5x - 6 &= \left(\lim_{x \to 2} x^2\right) + \left(\lim_{x \to 2} 5x\right) - \left(\lim_{x \to 2} 6\right) && \text{sum of limits} \\
&= \left(\lim_{x \to 2} x\right) \cdot \left(\lim_{x \to 2} x\right) + 5 \cdot \left(\lim_{x \to 2} x\right) - 6 && \text{product of limits and limit of constant} \\
&= 2 \cdot 2 + 5 \cdot 2 - 6 && \text{limits of } x \\
&= 8
\end{aligned}
\]

Since the limit of the denominator is non-zero, we can obtain our result by taking the ratio
of the separate limits.

\[
\lim_{x \to 2} \frac{2x - 3}{x^2 + 5x - 6} = \frac{\lim_{x \to 2} 2x - 3}{\lim_{x \to 2} x^2 + 5x - 6} = \frac{1}{8}
\]

The above works out quite simply. However, if we were to take the limit as x → 1 then
things are a bit harder. The limit of the numerator is:

\[
\lim_{x \to 1} 2x - 3 = 2 \cdot 1 - 3 = -1
\]

(we have not listed all the steps). And the limit of the denominator is

\[
\lim_{x \to 1} x^2 + 5x - 6 = 1 \cdot 1 + 5 - 6 = 0
\]

Since the limit of the numerator is non-zero, while the limit of the denominator is zero,
the limit of the ratio does not exist.

\[
\lim_{x \to 1} \frac{2x - 3}{x^2 + 5x - 6} = \text{DNE}
\]

End of Example 1.4.7
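Both halves of Example 1.4.7 can be seen numerically. The sketch below evaluates h(x) close to x = 2, where the limit laws apply, and close to x = 1, where the denominator tends to zero; the offsets are arbitrary illustrative choices.

```python
def h(x):
    return (2 * x - 3) / (x**2 + 5 * x - 6)

offsets = (-1e-3, -1e-6, 1e-6, 1e-3)

# Near x = 2 the values settle around 1/8 = 0.125.
print("near x = 2:", [round(h(2 + d), 6) for d in offsets])

# Near x = 1 the values are huge and change sign across x = 1.
print("near x = 1:", [round(h(1 + d), 1) for d in offsets])
```

Near x = 1 the outputs are large and positive on the left and large and negative on the right, which is consistent with the limit not existing there.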


It is IMPORTANT TO NOTE that it is not correct to write

\[
\lim_{x \to 1} \frac{2x - 3}{x^2 + 5x - 6} = \frac{-1}{0} = \text{DNE}
\]

Because we can only write

\[
\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)} = \text{something}
\]

when the limit of the denominator is non-zero (see Example 1.4.6 above).
With a little care you can use the arithmetic of limits to obtain the following rules for
limits of powers of functions and limits of roots of functions:

Theorem 1.4.8 (More arithmetic of limits — powers and roots).

Let n be a positive integer, let a ∈ ℝ and let f be a function so that

\[
\lim_{x \to a} f(x) = F
\]

for some real number F. Then the following holds

\[
\lim_{x \to a} (f(x))^n = \left(\lim_{x \to a} f(x)\right)^n = F^n
\]

so that the limit of a power is the power of the limit. Similarly, if

• n is an even number and F > 0, or
• n is an odd number and F is any real number

then

\[
\lim_{x \to a} (f(x))^{1/n} = \left(\lim_{x \to a} f(x)\right)^{1/n} = F^{1/n}
\]

Notice that we have to be careful when taking roots of limits that might be negative
numbers. To see why, consider the case n = 2 and the limits

\[
\lim_{x \to 4} x^{1/2} = 4^{1/2} = 2
\]
\[
\lim_{x \to 4} (-x)^{1/2} = (-4)^{1/2} = \text{not a real number}
\]

In order to evaluate such limits properly we need to use complex numbers which are
beyond the scope of this text.

Also note that the notation x^{1/2} refers to the positive square root of x. While 2 and (−2)
are both square-roots of 4, the notation 4^{1/2} means 2. This is something we must be careful
of¹⁵.

So again — let us do a few examples and carefully note what we are doing. 


15 Like ending sentences in prepositions — “This is something up with which we will not put.” This quote
   is attributed to Churchill though there is some dispute as to whether or not he really said it.

Example 1.4.9

\[
\begin{aligned}
\lim_{x \to 2} (4x^2 - 3)^{1/3} &= \left(\left(\lim_{x \to 2} 4x^2\right) - \left(\lim_{x \to 2} 3\right)\right)^{1/3} \\
&= (4 \cdot 2^2 - 3)^{1/3} \\
&= (16 - 3)^{1/3} \\
&= 13^{1/3}
\end{aligned}
\]

End of Example 1.4.9
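A numerical check of Example 1.4.9 is sketched below. It compares the cube root of 4x² − 3 at points near 2 with the value 13^(1/3) predicted by the power and root rule; the helper name `f` and the offsets are illustrative.

```python
def f(x):
    # 4x^2 - 3 is positive near x = 2, so a real cube root is returned.
    return (4 * x**2 - 3) ** (1 / 3)

target = 13 ** (1 / 3)
for d in (1e-1, 1e-3, 1e-5):
    print(f"f(2 - {d:g}) = {f(2 - d):.6f}   f(2 + {d:g}) = {f(2 + d):.6f}")
print("13^(1/3)   =", round(target, 6))
```

The sampled values close in on 13^(1/3) from both sides as the offset shrinks.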


By combining the last few theorems we can make the evaluation of limits of polynomials
and rational functions much easier:

Theorem 1.4.10 (Limits of polynomials and rational functions).

Let a ∈ ℝ, let P(x) be a polynomial and let R(x) be a rational function. Then

\[
\lim_{x \to a} P(x) = P(a)
\]

and provided R(x) is defined at x = a then

\[
\lim_{x \to a} R(x) = R(a)
\]

If R(x) is not defined at x = a then we are not able to apply this result.

So the previous examples are now much easier to compute:

\[
\begin{aligned}
\lim_{x \to 2} \frac{2x - 3}{x^2 + 5x - 6} &= \frac{4 - 3}{4 + 10 - 6} = \frac{1}{8} \\
\lim_{x \to 2} (4x^2 - 1) &= 16 - 1 = 15 \\
\lim_{x \to 2} \frac{x}{x - 1} &= \frac{2}{2 - 1} = 2
\end{aligned}
\]


It is clear that limits of polynomials are very easy, while those of rational functions are 
easy except when the denominator might go to zero. We have seen examples where the 
resulting limit does not exist, and some where it does. We now work to explain this more 
systematically. The following example demonstrates that it is sometimes possible to take 
the limit of a rational function to a point at which the denominator is zero. Indeed we 
must be able to do exactly this in order to be able to define derivatives in the next chapter. 




Example 1.4.11

Consider the limit

\[
\lim_{x \to 1} \frac{x^3 - x^2}{x - 1}
\]

If we try to apply the arithmetic of limits then we compute the limits of the numerator
and denominator separately

\[
\lim_{x \to 1} x^3 - x^2 = 1 - 1 = 0 \qquad (1.4.1)
\]
\[
\lim_{x \to 1} x - 1 = 1 - 1 = 0 \qquad (1.4.2)
\]

Since the limit of the denominator is zero, we cannot apply our theorem and we are, for the moment,
stuck. However, there is more that we can do here — the hint is that the numerator and
denominator both approach zero as x approaches 1. This means that there might be something
we can cancel.

So let us play with the expression a little more before we take the limit:

\[
\frac{x^3 - x^2}{x - 1} = \frac{x^2 (x - 1)}{x - 1} = x^2 \qquad \text{provided } x \neq 1.
\]

So what we really have here is the following function

\[
\frac{x^3 - x^2}{x - 1} = \begin{cases} x^2 & x \neq 1 \\ \text{undefined} & x = 1 \end{cases}
\]

If we plot the above function the graph looks exactly the same as y = x² except that the
function is not defined at x = 1 (since at x = 1 both numerator and denominator are zero).

When we compute a limit as x → a, the value of the function exactly at x = a is irrelevant.
We only care what happens to the function as we bring x very close to a. So for the above
problem we can write

\[
\frac{x^3 - x^2}{x - 1} = x^2 \qquad \text{when } x \text{ is close to 1 but not at } x = 1
\]

So the limit as x → 1 of the function is the same as the limit $\lim_{x \to 1} x^2$ since the functions are
the same except exactly at x = 1. By this reasoning we get

\[
\lim_{x \to 1} \frac{x^3 - x^2}{x - 1} = \lim_{x \to 1} x^2 = 1
\]

End of Example 1.4.11
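The claim that the original ratio and x² agree everywhere except at x = 1 is easy to test numerically. In the sketch below, `f` is the uncancelled ratio and `g` is the simplified form; both names are chosen for this illustration.

```python
def f(x):
    """(x^3 - x^2) / (x - 1); the formula gives 0/0 at x = 1."""
    return (x**3 - x**2) / (x - 1)

def g(x):
    """The simplified form x^2, defined everywhere."""
    return x**2

# Away from x = 1 the two agree up to floating-point rounding.
for x in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
    print(f"x = {x:<6} f(x) = {f(x):.6f}   g(x) = {g(x):.6f}")

# At x = 1 only g can be evaluated, and g(1) = 1 is the limit found above.
print("g(1) =", g(1))
```

Calling `f(1)` raises `ZeroDivisionError`, mirroring the "undefined" branch of the piecewise description, while `g(1)` returns the value of the limit.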


The reasoning in the above example can be made more general: 


Theorem 1.4.12. 


If f(x) = g(x) except when x = a then $\lim_{x \to a} f(x) = \lim_{x \to a} g(x)$ provided the limit of
g exists.

How do we know when to use this theorem? The big clue is that when we try to
compute the limit in a naive way, we end up with 0/0. We know that 0/0 does not make
sense, but it is an indication that there might be a common factor between numerator
and denominator that can be cancelled. In the previous example, this common factor was
(x − 1).


Example 1.4.13

Using this idea compute

\[
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h}
\]

• First we should check that we cannot just substitute h = 0 into this — clearly we
  cannot because the denominator would be 0.

• But we should also check the numerator to see if we have 0/0, and we see that the
  numerator gives us 1 − 1 = 0.

• Thus we have a hint that there is a common factor that we might be able to cancel.
  So now we look for the common factor and try to cancel it.

\[
\begin{aligned}
\frac{(1 + h)^2 - 1}{h} &= \frac{1 + 2h + h^2 - 1}{h} && \text{expand} \\
&= \frac{2h + h^2}{h} = \frac{h (2 + h)}{h} && \text{factor and then cancel} \\
&= 2 + h
\end{aligned}
\]

• Thus we really have that

\[
\frac{(1 + h)^2 - 1}{h} = \begin{cases} 2 + h & h \neq 0 \\ \text{undefined} & h = 0 \end{cases}
\]

  and because of this

\[
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h} = \lim_{h \to 0} 2 + h = 2
\]

Of course — we have written everything out in great detail here and that is way more than
is required for a solution to such a problem. Let us do it again a little more succinctly.


Example 1.4.14

Compute the following limit:

\[
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h}
\]

If we try to use the arithmetic of limits, then we see that the limit of the numerator and
the limit of the denominator are both zero. Hence we should try to factor them and cancel
any common factor. This gives

\[
\begin{aligned}
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h} &= \lim_{h \to 0} \frac{1 + 2h + h^2 - 1}{h} \\
&= \lim_{h \to 0} 2 + h \\
&= 2
\end{aligned}
\]

End of Example 1.4.14


Notice that even though we did this example carefully above, we have still written some 
text in our working explaining what we have done. You should always think about the 
reader and if in doubt, put in more explanation rather than less. We could make the above 
example even more terse 


Example 1.4.15

Compute the following limit:

\[
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h}
\]

Numerator and denominator both go to zero as h → 0. So factor and simplify:

\[
\begin{aligned}
\lim_{h \to 0} \frac{(1 + h)^2 - 1}{h} &= \lim_{h \to 0} \frac{1 + 2h + h^2 - 1}{h} \\
&= \lim_{h \to 0} 2 + h = 2
\end{aligned}
\]

End of Example 1.4.15
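The cancellation used in Examples 1.4.13 to 1.4.15 can be checked by evaluating the quotient directly for small non-zero h and comparing it with 2 + h. The function name `difference_quotient` is an illustrative label, not terminology from this section.

```python
def difference_quotient(h):
    """((1 + h)^2 - 1) / h, which is only defined for h != 0."""
    return ((1 + h) ** 2 - 1) / h

# For every non-zero h the quotient matches 2 + h up to rounding.
for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"h = {h:<6}  quotient = {difference_quotient(h):.6f}   2 + h = {2 + h:.6f}")
```

As h shrinks toward zero from either side, both columns approach 2.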


ey 
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A slightly harder one now 


im Example [0 | 


Compute the limit 


lim =e 
20 4/1-+-x-1 


If we try to use the arithmetic of limits we get 


lim x = 0 


x 


lim vl + -1= flim1+x—-1=1-1=0 
x— x 


So doing the naive thing we’d get 0/0. This suggests a common factor that can be can- 
celled. Since the numerator and denominator are not polynomials we have to try other 
tricks'® . We can simplify the denominator 1+ x —1 a lot, and in particular eliminate 
the square root, by multiplying it by its conjugate VI +x +1. 


x 7 se : V1+x+1 multiply by Comusate _ 
gree—1 ale —1” V/1T+x Py ”Y conjugate — 
eal ' 
= wees 1) ea bring things together 
1 1 
= 2 co a ) since (a—b)(a+b) =a — 0b’ 
V1+x) -1-1 
V1 1 
= a ) clean up a little 
_ x(V1+x+1) 
zx 
=vVl+x-+1 cancel the x 
So now we have 
x et 
= 1+ 1 
90 a/1 tx—1 lim Vv ale 


16 While these tricks are useful (and even cute¹⁷), Taylor polynomials (see Section 3.4) give us a more
systematic way of approaching this problem.

17 Mathematicians tend to have quite strong opinions on the beauty of mathematics. For example, Paul
Erdős¹⁸ said “Why are numbers beautiful? It’s like asking why is Beethoven’s Ninth Symphony beautiful.
If you don’t see why, someone can’t tell you. I know numbers are beautiful. If they aren’t beautiful,
nothing is.”

18 Arguably the most prolific mathematician of the 20th century, and definitely worth a google. The authors
do not know his opinion on nested footnotes¹⁹.

19 Nested footnotes are generally frowned upon, since they can get quite contorted; see XKCD-1208 and
also the novel “House of Leaves” by Mark Z. Danielewski.
Example 1.4.16
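A quick numerical sanity check of the conjugate trick can be reassuring. The sketch below is not part of the worked solution; the function names `original` and `simplified` are our own, and the sample points are an arbitrary choice. It simply evaluates both forms of the expression at a few small nonzero values of x and shows that they agree and drift towards 2.

```python
import math


def original(x):
    """x / (sqrt(1 + x) - 1); not defined at x = 0."""
    return x / (math.sqrt(1.0 + x) - 1.0)


def simplified(x):
    """sqrt(1 + x) + 1, the form left after multiplying by the conjugate."""
    return math.sqrt(1.0 + x) + 1.0


# Arbitrary sample points approaching 0 from both sides.
for x in (0.1, 0.01, 0.001, -0.001, -0.01):
    print(f"x = {x: .3f}   original = {original(x):.8f}   simplified = {simplified(x):.8f}")
```

A table of values like this is only evidence, not a proof; the algebra in the example is what justifies the answer.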


How did we know what to multiply by? Our function was of the form 


\[ \frac{a}{\sqrt{b} - c} \]


so, to eliminate the square root from the denominator, we employ a trick — we multiply 
by 1. Of course, multiplying by 1 doesn’t do anything. But if you multiply by 1 carefully 
you can leave the value the same, but change the form of the expression. More precisely 


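\begin{align*}
\frac{a}{\sqrt{b} - c} &= \frac{a}{\sqrt{b} - c} \cdot 1 \\
&= \frac{a}{\sqrt{b} - c} \cdot \frac{\sqrt{b} + c}{\sqrt{b} + c} \\
&= \frac{a \left( \sqrt{b} + c \right)}{\left( \sqrt{b} - c \right)\left( \sqrt{b} + c \right)} && \text{expand denominator carefully} \\
&= \frac{a \left( \sqrt{b} + c \right)}{\sqrt{b} \cdot \sqrt{b} - c\sqrt{b} + c\sqrt{b} - c \cdot c} && \text{do some cancellation} \\
&= \frac{a \left( \sqrt{b} + c \right)}{b - c^2}
\end{align*}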


Now the numerator contains roots, but the denominator is just a polynomial. 

Before we move on to limits at infinity, there is one more theorem to see. While the 
scope of its application is quite limited, it can be extremely useful. It is called a sandwich 
theorem or a squeeze theorem for reasons that will become apparent. 

Sometimes one is presented with an unpleasant ugly function such as 


\[ f(x) = x^2 \sin\left( \frac{\pi}{x} \right) \]


It is a fact of life, that not all the functions that are encountered in mathematics will be 
elegant and simple; this is especially true when the mathematics gets applied to real world 
problems. One just has to work with what one gets. So how can we compute 


\[ \lim_{x \to 0} x^2 \sin\left( \frac{\pi}{x} \right)? \]


Since it is the product of two functions, we might try 


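\begin{align*}
\lim_{x \to 0} x^2 \sin\left( \frac{\pi}{x} \right) &= \left( \lim_{x \to 0} x^2 \right) \cdot \left( \lim_{x \to 0} \sin\left( \frac{\pi}{x} \right) \right) \\
&= 0 \cdot \left( \lim_{x \to 0} \sin\left( \frac{\pi}{x} \right) \right) \\
&= 0
\end{align*}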




But we just cheated — we cannot use the arithmetic of limits theorem here, because the 
limit 


\[ \lim_{x \to 0} \sin\left( \frac{\pi}{x} \right) = \text{DNE} \]


does not exist. Now we did see the function $\sin(\pi/x)$ before (in Example 1.3.5), so you
should go back and look at it again. Unfortunately the theorem “the limit of a product is 
the product of the limits” only holds when the limits you are trying to multiply together 
actually exist. So we cannot use it. 

However, we do see that the function naturally decomposes into the product of two 
pieces: the functions $x^2$ and $\sin(\pi/x)$. We have sketched the two functions in the figure
on the left below. 


Figure 1.4.2. Left: the functions $x^2$ and $\sin(\pi/x)$. Right: $y = x^2 \sin(\pi/x)$ between $y = x^2$ and $y = -x^2$.


While $x^2$ is a very well behaved function and we know quite a lot about it, the function
$\sin(\pi/x)$ is quite ugly. One of the few things we can say about it is the following


\[ -1 \le \sin\left( \frac{\pi}{x} \right) \le 1 \qquad \text{provided } x \ne 0 \]


But if we multiply this expression by $x^2$ we get (because $x^2 \ge 0$)


\[ -x^2 \le x^2 \sin\left( \frac{\pi}{x} \right) \le x^2 \qquad \text{provided } x \ne 0 \]
and we have sketched the result in the figure above (on the right). So the function we are
interested in is squeezed or sandwiched between the functions $x^2$ and $-x^2$.

If we focus in on the picture close to $x = 0$ we see that as $x$ approaches 0, the functions
$x^2$ and $-x^2$ both approach 0. Further, because $x^2 \sin(\pi/x)$ is sandwiched between them,
it seems that it also approaches 0.


The following theorem tells us that this is indeed the case: 
Theorem 1.4.17 (Squeeze theorem (or sandwich theorem or pinch theorem)).


Let $a \in \mathbb{R}$ and let $f, g, h$ be three functions so that


\[ f(x) \le g(x) \le h(x) \]


for all $x$ in an interval around $a$, except possibly exactly at $x = a$. Then if


\[ \lim_{x \to a} f(x) = \lim_{x \to a} h(x) = L \]


then it is also the case that


\[ \lim_{x \to a} g(x) = L \]


Using the above theorem we can compute the limit we want and write it up nicely 


Example 1.4.18


Compute the limit 


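\[ \lim_{x \to 0} x^2 \sin\left( \frac{\pi}{x} \right) \]
Since $-1 \le \sin(\theta) \le 1$ for all real numbers $\theta$, we have
\[ -1 \le \sin\left( \frac{\pi}{x} \right) \le 1 \qquad \text{for all } x \ne 0 \]


Multiplying the above by $x^2$ we see that


\[ -x^2 \le x^2 \sin\left( \frac{\pi}{x} \right) \le x^2 \qquad \text{for all } x \ne 0 \]


Since $\lim_{x \to 0} x^2 = \lim_{x \to 0} (-x^2) = 0$, by the sandwich (or squeeze or pinch) theorem we have


\[ \lim_{x \to 0} x^2 \sin\left( \frac{\pi}{x} \right) = 0 \]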


Example 1.4.18
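The squeeze in the example above can also be checked numerically. The following is only an illustrative sketch: the names `g` and `is_squeezed` and the particular sample points are our own choices, not anything prescribed by the theorem. It tests the inequality at points approaching 0 from both sides and prints how small the sandwiched value becomes.

```python
import math


def g(x):
    """x^2 * sin(pi / x), defined for x != 0."""
    return x * x * math.sin(math.pi / x)


def is_squeezed(x):
    """Check -x^2 <= g(x) <= x^2 at a single nonzero point x."""
    return -x * x <= g(x) <= x * x


# Sample points approaching 0 from the right and from the left (arbitrary choice).
points = [sign * 0.3 ** k for k in range(1, 12) for sign in (1.0, -1.0)]

for x in points:
    print(f"x = {x: .3e}   g(x) = {g(x): .3e}   x^2 = {x * x:.3e}   squeezed: {is_squeezed(x)}")
```

Checking finitely many points cannot replace the inequality argument, but it is a useful way to catch a sign error in the bounds.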


Notice how we have used “words”. We have remarked on this several times already in 
the text, but we will keep mentioning it. It is okay to use words in your answers to maths 
problems — and you should do so! These let the reader know what you are doing and 
help you understand what you are doing. 

Another sandwich theorem example 


Example 1.4.19 


Let $f(x)$ be a function such that $1 \le f(x) \le x^2 - 2x + 2$. What is $\lim_{x \to 1} f(x)$?


We are already supplied with an inequality, so it is likely that it is going to help us. We 
should examine the limits of each side to see if they are the same: 


\begin{align*}
\lim_{x \to 1} 1 &= 1 \\
\lim_{x \to 1} \left( x^2 - 2x + 2 \right) &= 1 - 2 + 2 = 1
\end{align*}
So we see that the function $f(x)$ is trapped between two functions that both approach 1 as
$x \to 1$. Hence by the sandwich / pinch / squeeze theorem, we know that


\[ \lim_{x \to 1} f(x) = 1 \]


Example 1.4.19


1.5 ▲ Limits at infinity


Up until this point we have discussed what happens to a function as we move its input x 
closer and closer to a particular point a. For a great many applications of limits we need 
to understand what happens to a function when its input becomes extremely large — for 
example what happens to a population at a time far in the future. 

The definition of a limit at infinity has a similar flavour to the definition of limits at 
finite points that we saw above, but the details are a little different. We also need to 
distinguish between positive and negative infinity. As x becomes very large and positive it 
moves off towards $+\infty$ but when it becomes very large and negative it moves off towards
$-\infty$.

Again we give an informal definition; the full formal definition can be found in (the 
optional) Section 1.8 near the end of this chapter. 


We write 


\[ \lim_{x \to \infty} f(x) = L \]
when the value of the function f(x) gets closer and closer to L as we make x
larger and larger and positive.
Similarly we write


\[ \lim_{x \to -\infty} f(x) = L \]


when the value of the function f(x) gets closer and closer to L as we make x 
larger and larger and negative. 


Example 1.5.2 


Consider the two functions depicted below 
The dotted horizontal lines indicate the behaviour as $x$ becomes very large. The function
on the left has limits as $x \to \infty$ and as $x \to -\infty$ since the function “settles down” to a
particular value. On the other hand, the function on the right does not have a limit as
$x \to -\infty$ since the function just keeps getting bigger and bigger.


Example 1.5.2


Just as was the case for limits as x — a we will start with two very simple building 
blocks and build other limits from those. 


Theorem 1.5.3. 


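Let $c \in \mathbb{R}$ then the following limits hold

\begin{align*}
\lim_{x \to \infty} c &= c & \lim_{x \to -\infty} c &= c \\
\lim_{x \to \infty} \frac{1}{x} &= 0 & \lim_{x \to -\infty} \frac{1}{x} &= 0
\end{align*}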


Again, these limits interact nicely with standard arithmetic: 
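Theorem 1.5.4 (Arithmetic of limits at infinity).


Let $f(x), g(x)$ be two functions for which the limits


\[ \lim_{x \to \infty} f(x) = F \qquad \lim_{x \to \infty} g(x) = G \]


exist. Then the following limits hold


\begin{align*}
\lim_{x \to \infty} \left( f(x) \pm g(x) \right) &= F \pm G \\
\lim_{x \to \infty} f(x) g(x) &= FG \\
\lim_{x \to \infty} \frac{f(x)}{g(x)} &= \frac{F}{G} && \text{provided } G \ne 0
\end{align*}
and for rational numbers $r$,
\[ \lim_{x \to \infty} f(x)^r = F^r \qquad \text{provided } f(x)^r \text{ is defined for all } x \]


The analogous results hold for limits to $-\infty$.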


Note that, as was the case in Theorem 1.4.8, we need a little extra care with powers of 
functions. We must avoid taking square roots of negative numbers, or indeed any even 
root of a negative number²⁰.

Hence we have for all rational r > 0 


\[ \lim_{x \to \infty} \frac{1}{x^r} = 0 \]
but we have to be careful with
\[ \lim_{x \to -\infty} \frac{1}{x^r} = 0 \]


This is only true if the denominator of $r$ is not an even number²¹.
For example 


• $\lim_{x \to \infty} \frac{1}{x^{1/2}} = 0$, but $\lim_{x \to -\infty} \frac{1}{x^{1/2}}$ does not exist, because $x^{1/2}$ is not defined for $x < 0$.


• On the other hand, $x^{4/3}$ is defined for negative values of $x$ and $\lim_{x \to -\infty} \frac{1}{x^{4/3}} = 0$.


Our first application of limits at infinity will be to examine the behaviour of a rational 
function for very large x. To do this we use a “trick”. 


20 To be more precise, there is no real number $x$ so that $x^{\text{even power}}$ is a negative number. Hence we cannot
take the even-root of a negative number and express it as a real number. This is precisely what complex
numbers allow us to do, but alas there is not space in the course for us to explore them.
21 where we write $r = \frac{p}{q}$ with $p, q$ integers with no common factors. For example, $r = \frac{6}{4}$ should be written
as $r = \frac{3}{2}$ when considering this rule.
Example 1.5.5


Compute the following limit: 


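\[ \lim_{x \to \infty} \frac{x^2 - 3x + 4}{3x^2 + 8x + 1} \]


As $x$ becomes very large, it is the $x^2$ term that will dominate in both the numerator and
denominator and the other bits become irrelevant. That is, for very large $x$, $x^2$ is much
much larger than $x$ or any constant. So we pull out these dominant parts


\begin{align*}
\frac{x^2 - 3x + 4}{3x^2 + 8x + 1} &= \frac{x^2 \left( 1 - \frac{3}{x} + \frac{4}{x^2} \right)}{x^2 \left( 3 + \frac{8}{x} + \frac{1}{x^2} \right)} \\
&= \frac{1 - \frac{3}{x} + \frac{4}{x^2}}{3 + \frac{8}{x} + \frac{1}{x^2}} && \text{remove the common factors} \\
\lim_{x \to \infty} \frac{x^2 - 3x + 4}{3x^2 + 8x + 1} &= \lim_{x \to \infty} \frac{1 - \frac{3}{x} + \frac{4}{x^2}}{3 + \frac{8}{x} + \frac{1}{x^2}} \\
&= \frac{\lim_{x \to \infty} \left( 1 - \frac{3}{x} + \frac{4}{x^2} \right)}{\lim_{x \to \infty} \left( 3 + \frac{8}{x} + \frac{1}{x^2} \right)} && \text{arithmetic of limits} \\
&= \frac{\lim_{x \to \infty} 1 - \lim_{x \to \infty} \frac{3}{x} + \lim_{x \to \infty} \frac{4}{x^2}}{\lim_{x \to \infty} 3 + \lim_{x \to \infty} \frac{8}{x} + \lim_{x \to \infty} \frac{1}{x^2}} && \text{more arithmetic of limits} \\
&= \frac{1 - 0 + 0}{3 + 0 + 0} = \frac{1}{3}
\end{align*}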


Example 1.5.5
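It can help to see the "dominant term" idea in numbers. The sketch below (the helper name `r` and the chosen sample inputs are ours, purely for illustration) evaluates the rational function from the example at increasingly large x and prints how far each value is from 1/3.

```python
def r(x):
    """The rational function (x^2 - 3x + 4) / (3x^2 + 8x + 1)."""
    return (x ** 2 - 3 * x + 4) / (3 * x ** 2 + 8 * x + 1)


# Increasingly large inputs (an arbitrary choice of sample points).
for x in (10.0, 100.0, 1e3, 1e6, 1e8):
    print(f"x = {x:.0e}   r(x) = {r(x):.10f}   r(x) - 1/3 = {r(x) - 1 / 3: .3e}")
```

The gap shrinks roughly like a constant divided by x, which matches the 3/x and 8/x corrections that were discarded in the limit.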


The following one gets a little harder


Example 1.5.6


Find the limit as $x \to \infty$ of $\frac{\sqrt{4x^2+1}}{5x-1}$.


We use the same trick — try to work out what is the biggest term in the numerator and 
denominator and pull it to one side. 


• The denominator is dominated by $5x$.
• The biggest contribution to the numerator comes from the $4x^2$ inside the square-root.
When we pull $x^2$ outside the square-root it becomes $x$, so the numerator is
dominated by $x \cdot \sqrt{4} = 2x$.
• To see this more explicitly rewrite the numerator
\[ \sqrt{4x^2+1} = \sqrt{x^2 \left( 4 + 1/x^2 \right)} = \sqrt{x^2} \sqrt{4 + 1/x^2} = x \sqrt{4 + 1/x^2}. \]


• Thus the limit as $x \to \infty$ is


\begin{align*}
\lim_{x \to \infty} \frac{\sqrt{4x^2+1}}{5x-1} &= \lim_{x \to \infty} \frac{x \sqrt{4 + 1/x^2}}{x (5 - 1/x)} \\
&= \lim_{x \to \infty} \frac{\sqrt{4 + 1/x^2}}{5 - 1/x} \\
&= \frac{2}{5}
\end{align*}
Example 1.5.6


Now let us also think about the limit of the same function, $\frac{\sqrt{4x^2+1}}{5x-1}$, as $x \to -\infty$. There
is something subtle going on because of the square-root. First consider the function²²


\[ h(t) = \sqrt{t^2} \]


Evaluating this at $t = 7$ gives
\[ h(7) = \sqrt{7^2} = \sqrt{49} = 7 \]


We'll get much the same thing for any $t \ge 0$. For any $t \ge 0$, $h(t) = \sqrt{t^2}$ returns exactly $t$.
However now consider the function at $t = -3$


\[ h(-3) = \sqrt{(-3)^2} = \sqrt{9} = 3 = -(-3) \]


that is, the function is returning $-1$ times the input.
This is because when we defined $\sqrt{\ }$, we defined it to be the positive square-root, i.e.
the function $\sqrt{t}$ can never return a negative number. So being more careful
\[ h(t) = \sqrt{t^2} = |t| \]


where $|t|$ is the absolute value of $t$. You are perhaps used to thinking of absolute value
as “remove the minus sign”, but this is not quite correct. Let’s sketch the function


Figure 1.5.1. 


22 Just to change things up let’s use $t$ and $h(t)$ instead of the ubiquitous $x$ and $f(x)$.

It is a piecewise function defined by


\[ |t| = \begin{cases} t & t \ge 0 \\ -t & t < 0 \end{cases} \]


So that when we evaluate $h(-7)$ it is


\[ h(-7) = \sqrt{(-7)^2} = \sqrt{49} = 7 = -(-7) \]


We are now ready to examine the limit as $x \to -\infty$ in our previous example. Mostly it is
copy and paste from above.


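Example 1.5.7


Find the limit as $x \to -\infty$ of $\frac{\sqrt{4x^2+1}}{5x-1}$.


We use the same trick: try to work out what is the biggest term in the numerator and
denominator and pull it to one side. Since we are taking the limit as $x \to -\infty$ we should
think of $x$ as a large negative number.


• The denominator is dominated by $5x$.


• The biggest contribution to the numerator comes from the $4x^2$ inside the square-root.
When we pull the $x^2$ outside a square-root it becomes $|x| = -x$ (since we are taking
the limit as $x \to -\infty$), so the numerator is dominated by $-x \cdot \sqrt{4} = -2x$.


• To see this more explicitly rewrite the numerator

\begin{align*}
\sqrt{4x^2+1} &= \sqrt{x^2 \left( 4 + 1/x^2 \right)} = \sqrt{x^2} \sqrt{4 + 1/x^2} \\
&= |x| \sqrt{4 + 1/x^2} && \text{and since } x < 0 \text{ we have} \\
&= -x \sqrt{4 + 1/x^2}
\end{align*}


• Thus the limit as $x \to -\infty$ is


\begin{align*}
\lim_{x \to -\infty} \frac{\sqrt{4x^2+1}}{5x-1} &= \lim_{x \to -\infty} \frac{-x \sqrt{4 + 1/x^2}}{x (5 - 1/x)} \\
&= \lim_{x \to -\infty} \frac{-\sqrt{4 + 1/x^2}}{5 - 1/x} \\
&= -\frac{2}{5}
\end{align*}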


Example 1.5.7
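The sign change between the two limits is easy to observe numerically. The snippet below is a small sketch (the name `q` and the sample inputs are our own) that evaluates the function from Examples 1.5.6 and 1.5.7 at large positive and large negative x.

```python
import math


def q(x):
    """sqrt(4x^2 + 1) / (5x - 1), the function from the two examples."""
    return math.sqrt(4 * x * x + 1) / (5 * x - 1)


# Large positive inputs head towards 2/5, large negative inputs towards -2/5.
for x in (1e2, 1e4, 1e6, -1e2, -1e4, -1e6):
    print(f"x = {x: .0e}   q(x) = {q(x): .8f}")
```

The numerator is always positive, so the sign of the output is decided entirely by the denominator 5x - 1; that is the numerical face of $\sqrt{x^2} = |x|$.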


So the limit as $x \to -\infty$ is almost the same but we gain a minus sign. This is definitely
not the case in general; you have to think about each example separately.
Here is a sketch of the function in question.
Figure 1.5.2.


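Example 1.5.8


Compute the following limit:
\[ \lim_{x \to \infty} \left( x^{7/5} - x \right) \]


In this case we cannot use the arithmetic of limits to write this as


\begin{align*}
\lim_{x \to \infty} \left( x^{7/5} - x \right) &= \lim_{x \to \infty} x^{7/5} - \lim_{x \to \infty} x \\
&= \infty - \infty
\end{align*}


because the limits do not exist. We can only use the limit laws when the limits exist. So
we should go back and think some more.

When $x$ is very large, $x^{7/5} = x \cdot x^{2/5}$ will be much larger than $x$, so the $x^{7/5}$ term will
dominate the $x$ term. So factor out $x^{7/5}$ and rewrite it as


\[ x^{7/5} - x = x^{7/5} \left( 1 - \frac{1}{x^{2/5}} \right) \]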


Consider what happens to each of the factors as $x \to \infty$


• For large $x$, $x^{7/5} > x$ (this is actually true for any $x > 1$). In the limit as $x \to +\infty$,
$x$ becomes arbitrarily large and positive, and $x^{7/5}$ must be bigger still, so it follows
that


\[ \lim_{x \to \infty} x^{7/5} = +\infty. \]


• On the other hand, $(1 - x^{-2/5})$ becomes closer and closer to 1. We can use the
arithmetic of limits to write this as


\[ \lim_{x \to \infty} \left( 1 - x^{-2/5} \right) = \lim_{x \to \infty} 1 - \lim_{x \to \infty} x^{-2/5} = 1 - 0 = 1 \]


So the product of these two factors will become larger and larger (and positive) as $x$
moves off to infinity. Hence we have


\[ \lim_{x \to \infty} x^{7/5} \left( 1 - x^{-2/5} \right) = +\infty \]
Example 1.5.8


But remember $+\infty$ and $-\infty$ are not numbers; the last equation in the example is shorthand
for “the function becomes arbitrarily large”.


In the previous section we saw that finite limits and arithmetic interact very nicely (see 
Theorems 1.4.2 and 1.4.8). This enabled us to compute the limits of more complicated
functions in terms of simpler ones. When limits of functions go to plus or minus infinity
we are quite a bit more restricted in what we can deduce. The next theorem states some 
results concerning the sum, difference, ratio and product of infinite limits — unfortunately 
in many cases we cannot make general statements and the results will depend on the 
details of the problem at hand. 
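Theorem 1.5.9 (Arithmetic of infinite limits).


Let $a, c, H \in \mathbb{R}$ and let $f, g, h$ be functions defined in an interval around $a$ (but
they need not be defined at $x = a$), so that


\[ \lim_{x \to a} f(x) = +\infty \qquad \lim_{x \to a} g(x) = +\infty \qquad \lim_{x \to a} h(x) = H \]


• $\lim_{x \to a} \left( f(x) + g(x) \right) = +\infty$


• $\lim_{x \to a} \left( f(x) + h(x) \right) = +\infty$


• $\lim_{x \to a} \left( f(x) - g(x) \right)$ undetermined


• $\lim_{x \to a} \left( f(x) - h(x) \right) = +\infty$


• $\lim_{x \to a} c f(x) = \begin{cases} +\infty & c > 0 \\ 0 & c = 0 \\ -\infty & c < 0 \end{cases}$


• $\lim_{x \to a} \left( f(x) \cdot g(x) \right) = +\infty$


• $\lim_{x \to a} f(x) h(x) = \begin{cases} +\infty & H > 0 \\ -\infty & H < 0 \\ \text{undetermined} & H = 0 \end{cases}$


• $\lim_{x \to a} \frac{f(x)}{g(x)}$ undetermined


• $\lim_{x \to a} \frac{f(x)}{h(x)} = \begin{cases} +\infty & H > 0 \\ -\infty & H < 0 \\ \text{undetermined} & H = 0 \end{cases}$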


Note that by “undetermined” we mean that the limit may or may not exist, but cannot 
be determined from the information given in the theorem. See Example 1.4.6 for an exam- 
ple of what we mean by “undetermined”. Additionally consider the following example. 


Example 1.5.10 


Consider the following 3 functions: 
\[ f(x) = x^{-2} \qquad g(x) = 2x^{-2} \qquad h(x) = x^{-2} - 1 \]
Their limits as $x \to 0$ are:


\[ \lim_{x \to 0} f(x) = +\infty \qquad \lim_{x \to 0} g(x) = +\infty \qquad \lim_{x \to 0} h(x) = +\infty. \]

Say we want to compute the limit of the difference of two of the above functions as $x \to 0$.
Then the previous theorem cannot help us. This is not because it is too weak, rather it is
because the difference of two infinite limits can be either plus infinity, minus infinity or
some finite number depending on the details of the problem. For example,


\begin{align*}
\lim_{x \to 0} \left( f(x) - g(x) \right) &= \lim_{x \to 0} -x^{-2} = -\infty \\
\lim_{x \to 0} \left( f(x) - h(x) \right) &= \lim_{x \to 0} 1 = 1 \\
\lim_{x \to 0} \left( g(x) - h(x) \right) &= \lim_{x \to 0} \left( x^{-2} + 1 \right) = +\infty
\end{align*}


Example 1.5.10
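To see "undetermined" in action, here is a small numerical sketch. It assumes the three functions are $f(x) = x^{-2}$, $g(x) = 2x^{-2}$ and $h(x) = x^{-2} - 1$, which is what the three differences computed in the example are consistent with; the sample points are our own choice.

```python
def f(x):
    """Assumed form: x^(-2)."""
    return x ** -2


def g(x):
    """Assumed form: 2 * x^(-2)."""
    return 2 * x ** -2


def h(x):
    """Assumed form: x^(-2) - 1."""
    return x ** -2 - 1


# Each function blows up near 0, yet the three differences behave differently.
for x in (0.1, 0.01, 0.001):
    print(f"x = {x}   f-g = {f(x) - g(x): .3e}   f-h = {f(x) - h(x): .6f}   g-h = {g(x) - h(x): .3e}")
```

All three inputs to each subtraction are huge, yet one difference runs off to minus infinity, one sits at 1, and one runs off to plus infinity.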


1.6 ▲ Continuity


We have seen that computing the limits of some functions (polynomials and rational
functions) is very easy because
\[ \lim_{x \to a} f(x) = f(a). \]

That is, the limit as $x$ approaches $a$ is just $f(a)$. Roughly speaking, the reason we can
compute the limit this way is that these functions do not have any abrupt jumps near $a$.

Many other functions have this property, sin(x) for example. A function with this 
property is called “continuous” and there is a precise mathematical definition for it. If 
you do not recall interval notation, then now is a good time to take a quick look back at 
Definition 0.3.5. 


A function $f(x)$ is continuous at $a$ if


\[ \lim_{x \to a} f(x) = f(a) \]


If a function is not continuous at $a$ then it is said to be discontinuous at $a$.

When we write that $f$ is continuous without specifying a point, then typically
this means that $f$ is continuous at $a$ for all $a \in \mathbb{R}$.

When we write that f(x) is continuous on the open interval (a,b) then the func- 
tion is continuous at every point c satisfying a < c < b. 


So if a function is continuous at x = a we immediately know that 


• $f(a)$ exists
• $\lim_{x \to a^-} f(x)$ exists and is equal to $f(a)$, and


• $\lim_{x \to a^+} f(x)$ exists and is equal to $f(a)$.


» Quick aside — one-sided continuity 


Notice in the above definition of continuity on an interval (a,b) we have carefully avoided 
saying anything about whether or not the function is continuous at the endpoints of the 
interval — ie. is f(x) continuous at x = a or x = b. This is because talking of continuity 
at the endpoints of an interval can be a little delicate. 
In many situations we will be given a function f(x) defined on a closed interval [a,b]. 
For example, we might have: 
\[ f(x) = \frac{x+1}{x+2} \qquad \text{for } x \in [0,1]. \]


For any $0 \le x \le 1$ we know the value of $f(x)$. However for $x < 0$ or $x > 1$ we know
nothing about the function; indeed it has not been defined.
So now, consider what it means for $f(x)$ to be continuous at $x = 0$. We need to have


\[ \lim_{x \to 0} f(x) = f(0), \]


however this implies that the one-sided limits


\[ \lim_{x \to 0^+} f(x) = f(0) \qquad \text{and} \qquad \lim_{x \to 0^-} f(x) = f(0) \]

Now the first of these one-sided limits involves examining the behaviour of f(x) for x > 0. 
Since this involves looking at points for which f(x) is defined, this is something we can 
do. On the other hand the second one-sided limit requires us to understand the behaviour 
of f(x) for x < 0. This we cannot do because the function hasn’t been defined for x < 0. 

One way around this problem is to generalise the idea of continuity to one-sided
continuity, just as we generalised limits to get one-sided limits.


A function $f(x)$ is continuous from the right at $a$ if


\[ \lim_{x \to a^+} f(x) = f(a) \]


Similarly a function $f(x)$ is continuous from the left at $a$ if


\[ \lim_{x \to a^-} f(x) = f(a) \]


Using the definition of one-sided continuity we can now define what it means for a 
function to be continuous on a closed interval. 
A function $f(x)$ is continuous on the closed interval $[a,b]$ when


• $f(x)$ is continuous on $(a,b)$,
• $f(x)$ is continuous from the right at $a$, and
• $f(x)$ is continuous from the left at $b$.


Note that the last two conditions are equivalent to


\[ \lim_{x \to a^+} f(x) = f(a) \qquad \text{and} \qquad \lim_{x \to b^-} f(x) = f(b). \]


» Back to the main text 


We already know from our work above that polynomials are continuous, and that rational 
functions are continuous at all points in their domains — i.e. where their denominators 
are non-zero. As we did for limits, we will see that continuity interacts “nicely” with 
arithmetic. This will allow us to construct complicated continuous functions from simpler 
continuous building blocks (like polynomials). 

But first, a few examples... 


Example 1.6.4 


Consider the functions drawn below 


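\[ f(x) = \begin{cases} x & x < 1 \\ x + 2 & x \ge 1 \end{cases} \qquad g(x) = \begin{cases} 1/x^2 & x \ne 0 \\ 0 & x = 0 \end{cases} \qquad h(x) = \begin{cases} \frac{x^3 - x^2}{x - 1} & x \ne 1 \\ 0 & x = 1 \end{cases} \]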


Determine where they are continuous and discontinuous: 


• When $x < 1$ then $f(x)$ is a straight line (and so a polynomial) and so it is continuous
at every point $x < 1$. Similarly when $x > 1$ the function is a straight line and so it is
continuous at every point $x > 1$. The only point which might be a discontinuity is at
$x = 1$. We see that the one sided limits are different. Hence the limit at $x = 1$ does
not exist and so the function is discontinuous at $x = 1$.
But note that $f(x)$ is continuous from one side. Which?


• The middle case is much like the previous one. When $x \ne 0$, $g(x)$ is a rational
function and so is continuous everywhere on its domain (which is all reals except
$x = 0$). Thus the only point where $g(x)$ might be discontinuous is at $x = 0$. We see
that neither of the one-sided limits exists at $x = 0$, so the limit does not exist at $x = 0$.
Hence the function is discontinuous at $x = 0$.


• We have seen the function $h(x)$ before. By the same reasoning as above, we know it
is continuous except at $x = 1$ which we must check separately.


By definition of $h(x)$, $h(1) = 0$. We must compare this to the limit as $x \to 1$. We did
this before.
\[ \frac{x^3 - x^2}{x - 1} = \frac{x^2 (x - 1)}{x - 1} = x^2 \]
So $\lim_{x \to 1} \frac{x^3 - x^2}{x - 1} = \lim_{x \to 1} x^2 = 1 \ne h(1)$. Hence $h$ is discontinuous at $x = 1$.


Example 1.6.4
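The mismatch between the limit of h and the value h(1) can be seen directly by evaluating the function. This is a sketch only: it takes h to be $(x^3 - x^2)/(x - 1)$ away from $x = 1$ with $h(1) = 0$, as in the computation above, and the sample points are our own choice.

```python
def h(x):
    """(x^3 - x^2) / (x - 1) for x != 1, with the value 0 assigned at x = 1."""
    if x == 1:
        return 0.0
    return (x ** 3 - x ** 2) / (x - 1)


# Values on either side of x = 1 cluster near 1, but h(1) itself is 0.
for x in (0.9, 0.99, 0.999, 1.0, 1.001, 1.01, 1.1):
    print(f"x = {x:<6}  h(x) = {h(x):.6f}")
```

The single out-of-place value at x = 1 is exactly what the limit ignores and what continuity cares about.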


This example illustrates different sorts of discontinuities: 


• The function $f(x)$ has a “jump discontinuity” because the function “jumps” from
one finite value on the left to another value on the right.


• The second function, $g(x)$, has an “infinite discontinuity” since $\lim_{x \to 0} g(x) = +\infty$.


• The third function, $h(x)$, has a “removable discontinuity” because we could make
the function continuous at that point by redefining the function at that point, i.e.
setting $h(1) = 1$. That is


\[ \text{new function } h(x) = \begin{cases} \frac{x^3 - x^2}{x - 1} & x \ne 1 \\ 1 & x = 1 \end{cases} \]


Showing a function is continuous can be a pain, but just as the limit laws help us 
compute complicated limits in terms of simpler limits, we can use them to show that 
complicated functions are continuous by breaking them into simpler pieces. 


Theorem 1.6.5 (Arithmetic of continuity). 


Let $a, c \in \mathbb{R}$ and let $f(x)$ and $g(x)$ be functions that are continuous at $a$. Then the
following functions are also continuous at $x = a$:


• $f(x) + g(x)$ and $f(x) - g(x)$,


• $c f(x)$ and $f(x) g(x)$, and


• $\frac{f(x)}{g(x)}$ provided $g(a) \ne 0$.

Above we stated that polynomials and rational functions are continuous (being care-
ful about domains of rational functions — we must avoid the denominators being zero) 
without making it a formal statement. This is easily fixed... 


Lemma 1.6.6. 


Let $c \in \mathbb{R}$. The functions


\[ f(x) = x \qquad g(x) = c \]


are continuous everywhere on the real line.


This isn’t quite the result we wanted (that’s a couple of lines below) but it is a small 
result that we can combine with the arithmetic of limits to get the result we want. Such 
small helpful results are called “lemmas” and they will arise more as we go along. 

Now since we can obtain any polynomial and any rational function by carefully adding, 
subtracting, multiplying and dividing the functions f(x) = x and g(x) = c, the above 
lemma combines with the “arithmetic of continuity” theorem to give us the result we 
want: 


Theorem 1.6.7 (Continuity of polynomials and rational functions). 


Every polynomial is continuous everywhere. Similarly every rational function is 
continuous except where its denominator is zero (i.e. on all its domain). 


With some more work this result can be extended to wider families of functions: 


Theorem 1.6.8. 


The following functions are continuous everywhere in their domains
• polynomials, rational functions
• roots and powers
• trig functions and their inverses
• exponential and the logarithm


We haven’t encountered inverse trigonometric functions, nor exponential functions 
or logarithms, but we will see them in the next chapter. For the moment, just file the 
information away. 

Using a combination of the above results you can show that many complicated func- 
tions are continuous except at a few points (usually where a denominator is equal to zero). 
Example 1.6.9


Where is the function $f(x) = \frac{\sin(x)}{2 + \cos(x)}$ continuous?
We just break things down into pieces and then put them back together keeping track
of where things might go wrong.


• The function is a ratio of two pieces, so check if the numerator is continuous, the
denominator is continuous, and if the denominator might be zero.


• The numerator is $\sin(x)$ which is “continuous on its domain” according to one of the
above theorems. Its domain is all real numbers²³, so it is continuous everywhere. No
problems here.


• The denominator is the sum of 2 and $\cos(x)$. Since 2 is a constant it is continuous
everywhere. Similarly (we just checked things for the previous point) we know that
$\cos(x)$ is continuous everywhere. Hence the denominator is continuous.


• So we just need to check if the denominator is zero. One of the facts that we should
know²⁴ is that


\[ -1 \le \cos(x) \le 1 \]
and so by adding 2 we get
\[ 1 \le 2 + \cos(x) \le 3 \]
Thus no matter what value of $x$, $2 + \cos(x) \ge 1$ and so cannot be zero.


• So the numerator is continuous, the denominator is continuous and nowhere zero,
so the function is continuous everywhere.


If the function were changed to $\frac{\sin(x)}{x^2 - 5x + 6}$ much of the same reasoning can be used.
Being a little terse we could answer with:


• Numerator and denominator are continuous.
• Since $x^2 - 5x + 6 = (x - 2)(x - 3)$ the denominator is zero when $x = 2, 3$.


• So the function is continuous everywhere except possibly at $x = 2, 3$. In order to
verify that the function really is discontinuous at those points, it suffices to verify
that the numerator is non-zero at $x = 2, 3$. Indeed we know that $\sin(x)$ is zero only
when $x = n\pi$ (for any integer $n$). Hence $\sin(2), \sin(3) \ne 0$. Thus the numerator
is non-zero, while the denominator is zero and hence $x = 2, 3$ really are points of
discontinuity.


23 Remember that sin and cos are defined on all real numbers, so $\tan(x) = \sin(x)/\cos(x)$ is continuous
everywhere except where $\cos(x) = 0$. This happens when $x = \frac{\pi}{2} + n\pi$ for any integer $n$. If you cannot
remember where $\tan(x)$ “blows up” or $\sin(x) = 0$ or $\cos(x) = 0$ then you should definitely revise
trigonometric functions. Come to think of it, just revise them anyway.

24 If you do not know this fact then you should revise trigonometric functions. See the previous footnote.

Note that this example raises a subtle point about checking continuity when numerator
and denominator are simultaneously zero. There are quite a few possible outcomes in this
case and we need more sophisticated tools to adequately analyse the behaviour of functions
near such points. We will return to this question later in the text after we have
developed Taylor expansions (see Section 3.4).


Example 1.6.9


So we know what happens when we add, subtract, multiply and divide; what about
when we compose functions? Well, limits and compositions work nicely when things are
continuous.


Theorem 1.6.10 (Compositions and continuity). 


If $f$ is continuous at $b$ and $\lim_{x \to a} g(x) = b$ then $\lim_{x \to a} f(g(x)) = f(b)$. I.e.


\[ \lim_{x \to a} f(g(x)) = f\left( \lim_{x \to a} g(x) \right) \]


Hence if $g$ is continuous at $a$ and $f$ is continuous at $g(a)$ then the composite
function $(f \circ g)(x) = f(g(x))$ is continuous at $a$.


So when we compose two continuous functions we get a new continuous function. 
We can put this to use 


Example 1.6.11 


Where are the following functions continuous? 


\[ f(x) = \sin\left( x^2 + \cos(x) \right) \qquad g(x) = \sqrt{\sin(x)} \]


Our first step should be to break the functions down into pieces and study them. When 
we put them back together we should be careful of dividing by zero, or falling outside the 
domain. 


• The function $f(x)$ is the composition of $\sin(x)$ with $x^2 + \cos(x)$.

• These pieces, $\sin(x)$, $x^2$, $\cos(x)$, are continuous everywhere.

• So the sum $x^2 + \cos(x)$ is continuous everywhere.

• And hence the composition of $\sin(x)$ and $x^2 + \cos(x)$ is continuous everywhere.
The second function is a little trickier.

• The function $g(x)$ is the composition of $\sqrt{x}$ with $\sin(x)$.


• $\sqrt{x}$ is continuous on its domain $x \ge 0$.
• $\sin(x)$ is continuous everywhere, but it is negative in many places.
• In order for $g(x)$ to be defined and continuous we must restrict $x$ so that $\sin(x) \ge 0$.


• Recall the graph of $\sin(x)$:


Hence $\sin(x) \ge 0$ when $x \in [0, \pi]$ or $x \in [2\pi, 3\pi]$ or $x \in [-2\pi, -\pi]$ or .... To be more
precise $\sin(x)$ is non-negative when $x \in [2n\pi, (2n+1)\pi]$ for any integer $n$.


• Hence $g(x)$ is continuous when $x \in [2n\pi, (2n+1)\pi]$ for any integer $n$.


Example 1.6.11
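The interval description of where $\sqrt{\sin(x)}$ makes sense can be cross-checked by machine. The sketch below is illustrative only; the helper names `g` and `in_domain_interval` are our own, and the test points are deliberately chosen away from the interval endpoints, where floating point rounding of multiples of π makes comparisons unreliable.

```python
import math


def g(x):
    """sqrt(sin(x)); raises ValueError where sin(x) < 0."""
    s = math.sin(x)
    if s < 0:
        raise ValueError("sin(x) is negative, so sqrt(sin(x)) is not a real number")
    return math.sqrt(s)


def in_domain_interval(x):
    """True when x lies in [2n*pi, (2n+1)*pi] for some integer n."""
    n = math.floor(x / (2 * math.pi))
    return 2 * n * math.pi <= x <= (2 * n + 1) * math.pi


# Compare the interval test with the sign of sin(x) at a few interior points.
for x in (math.pi / 2, 3 * math.pi / 2, 2 * math.pi + 1, -math.pi / 2, -3 * math.pi / 2):
    print(f"x = {x: .4f}   in interval: {in_domain_interval(x)}   sin(x) >= 0: {math.sin(x) >= 0}")
```

At every sampled point the two tests agree, which is what the bullet points above predict.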


Continuous functions are very nice (mathematically speaking). Functions from the 
“real world” tend to be continuous (though not always). The key aspect that makes them 
nice is the fact that they don’t jump about. 

The absence of such jumps leads to the following theorem which, while it can be quite 
confusing on first glance, actually says something very natural — obvious even. It says, 
roughly speaking, that, as you draw the graph y = f(x) starting at x = a and ending at 
x = b, y changes continuously from y = f(a) to y = f(b), with no jumps, and conse- 
quently y must take every value between f(a) and f(b) at least once. We'll start by just 
giving the precise statement and then we'll explain it in detail. 


Theorem 1.6.12 (Intermediate value theorem (IVT)). 


Let $a < b$ and let $f$ be a function that is continuous at all points $a \le x \le b$. If $Y$
is any number between $f(a)$ and $f(b)$ then there exists some number $c \in [a,b]$ so
that $f(c) = Y$.


Like the $\epsilon$–$\delta$ definition of limits²⁵, we should break this theorem down into pieces.
Before we do that, keep the following pictures in mind.


25 The interested student is invited to take a look at the optional Section 1.7

Figure 1.6.1.


Now the break-down 


• “Let $a < b$ and let $f$ be a function that is continuous at all points $a \le x \le b$.” This
is setting the scene. We have $a, b$ with $a < b$ (we can safely assume these to be real
numbers). Our function must be continuous at all points between $a$ and $b$.


• “If $Y$ is any number between $f(a)$ and $f(b)$.” Now we need another number $Y$ and
the only restriction on it is that it lies between $f(a)$ and $f(b)$. That is, if $f(a) \le f(b)$
then $f(a) \le Y \le f(b)$. Or if $f(a) \ge f(b)$ then $f(a) \ge Y \ge f(b)$. So notice that
$Y$ could be equal to $f(a)$ or $f(b)$; if we wanted to avoid that possibility, then we
would normally explicitly say $Y \ne f(a), f(b)$ or we would write that $Y$ is strictly
between $f(a)$ and $f(b)$.


• “There exists some number $c \in [a,b]$ so that $f(c) = Y$.” So if we satisfy all of the
above conditions, then there has to be some real number $c$ lying between $a$ and $b$ so
that when we evaluate $f(c)$ it is $Y$.


So that breaks down the theorem statement by statement, but what does it actually mean?
• Draw any continuous function you like between $a$ and $b$; it must be continuous.


• The function takes the value $f(a)$ at $x = a$ and $f(b)$ at $x = b$; see the left-hand
figure above.


• Now we can pick any $Y$ that lies between $f(a)$ and $f(b)$; see the middle figure
above. The IVT²⁶ tells us that there must be some $x$-value that when plugged into
the function gives us $Y$. That is, there is some $c$ between $a$ and $b$ so that $f(c) = Y$. We
can also interpret this graphically; the IVT tells us that the horizontal straight line
$y = Y$ must intersect the graph $y = f(x)$ at some point $(c, Y)$ with $a \le c \le b$.


• Notice that the IVT does not tell us how many such $c$-values there are, just that there
is at least one of them. See the right-hand figure above. For that particular choice of
$Y$ there are three different $c$ values so that $f(c_1) = f(c_2) = f(c_3) = Y$.


26 Often with big important useful theorems like this one, writing out the full name again and again
becomes tedious, so we abbreviate it. Such abbreviations are okay provided the reader knows this is
what you are doing, so the first time you use an abbreviation you should let the reader know. Much
like we are doing here in this footnote.

This theorem says that if $f(x)$ is a continuous function on all of the interval $a \le x \le b$ then
as $x$ moves from $a$ to $b$, $f(x)$ takes every value between $f(a)$ and $f(b)$ at least once. To put
this slightly differently, if $f$ were to avoid a value between $f(a)$ and $f(b)$ then $f$ cannot be
continuous on $[a, b]$.
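The IVT only promises that a suitable c exists; it does not say how to find one. One standard numerical approach, which is not part of the discussion above and is offered only as a sketch, is bisection: keep halving an interval on which f(x) - Y changes sign. The function name `find_c`, the tolerance, and the test function `cubic` are all hypothetical choices made for this illustration.

```python
def find_c(f, a, b, Y, tol=1e-10):
    """Bisection search for some c in [a, b] with f(c) close to Y.

    Assumes f is continuous on [a, b] and that Y lies between f(a) and f(b).
    """
    lo, hi = a, b
    g_lo = f(lo) - Y
    g_hi = f(hi) - Y
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("Y is not between f(a) and f(b)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f(mid) - Y
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            # Sign change on [lo, mid]: keep the left half.
            hi = mid
        else:
            # Otherwise the sign change is on [mid, hi]: keep the right half.
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2


def cubic(x):
    """A continuous example function: x^3 - x."""
    return x ** 3 - x


# cubic(0) = 0 and cubic(2) = 6, so Y = 3 lies between them.
c = find_c(cubic, 0.0, 2.0, 3.0)
print(f"c = {c:.8f}   cubic(c) = {cubic(c):.8f}")
```

Bisection returns just one such c; as noted above, there may be several, and the method gives no information about the others.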

It is not hard to convince yourself that the continuity of f is crucial to the IVT. Without 
it one can quickly construct examples of functions that contradict the theorem. See the 
figure below for a few non-continuous examples: 


Figure 1.6.2. 


In the left-hand example we see that a discontinuous function can “jump” over the
Y-value we have chosen, so there is no x-value that makes f(x) = Y. The right-hand
example demonstrates why we need to be careful with the ends of the interval. In
particular, a function must be continuous over the whole interval [a, b] including the
endpoints of the interval. If we only required the function to be continuous on (a, b) (so strictly
between a and b) then the function could “jump” over the Y-value at a or b.
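The left-hand situation can be imitated numerically. The sketch below is only an illustration: the step function `f` and the target value `Y` are made-up choices, not the function drawn in the figure.

```python
def f(x):
    """A hypothetical step function on [0, 1] with a jump at x = 0.5."""
    return 0.0 if x < 0.5 else 1.0

a, b = 0.0, 1.0
Y = 0.5  # lies strictly between f(a) = 0 and f(b) = 1

# Sample the interval finely and collect every value f takes on the samples.
samples = [a + (b - a) * k / 1000 for k in range(1001)]
values = {f(x) for x in samples}

print(values)       # only two distinct values ever appear
print(Y in values)  # f never equals Y: it jumps over it
```

By construction this `f` only ever takes the values 0 and 1, so no choice of c gives f(c) = Y, even though Y lies between f(a) and f(b).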

If you are still confused then here is a “real-world” example.

Example 1.6.13

You are climbing the Grouse-grind²⁷ with a friend, call him Bob. Bob was eager and
started at 9am. Bob, while very eager, is also very clumsy; he sprained his ankle somewhere
along the path, stopped moving at 9:21am and is just sitting²⁸ enjoying the
view. You get there late and start climbing at 10am and, being quite fit, you get to the top at
11am. The IVT implies that at some time between 10am and 11am you meet up with Bob.

You can translate this situation into the form of the IVT as follows. Let t be time and let
a = 10am and b = 11am. Let g(t) be your distance along the trail. Hence²⁹ g(a) = 0 and
g(b) = 2.9km. Since you are a mortal, your position along the trail is a continuous function:
no helicopters or teleportation or... We have no idea where Bob is sitting, except that
he is somewhere between g(a) and g(b); call this point Y. The IVT guarantees that there
is some time c between a and b (so between 10am and 11am) with g(c) = Y (and your
position will be the same as Bob’s).

Example 1.6.13
27 If you don’t know it then google it.
28 Hopefully he remembered to carry something warm.
29 It’s amazing what facts you can find on wikipedia.
Aside from finding Bob sitting by the side of the trail, one of the most important
applications of the IVT is determining where a function is zero. For quadratics we know (or
should know) that

\[ ax^2 + bx + c = 0 \quad\text{when}\quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

While the Babylonians could (mostly, but not quite) do the above, the corresponding
formula for solving a cubic is uglier and that for a quartic is uglier still. One of the most
famous results in mathematics demonstrates that no such formula exists for quintics or
higher degree polynomials³⁰.

So even for polynomials we cannot, in general, write down explicit formulae for their
zeros and have to make do with numerical approximations, i.e. write down the root as
a decimal expansion to whatever precision we desire. For more complicated functions we
have no choice: there is no reason that the zeros should be expressible as nice neat little
formulas. At the same time, finding the zeros of a function:

f(x) = 0

or solving equations of the form³¹

g(x) = h(x)

can be a crucial step in many mathematical proofs and applications.

For this reason there is a considerable body of mathematics which focuses just on find- 
ing the zeros of functions. The IVT provides a very simple way to “locate” the zeros of a 
function. In particular, if we know a continuous function is negative at a point x = a and 
positive at another point x = b, then there must (by the IVT) be a point x = c between a 
and b where f(c) = 0. 


Figure 1.6.3. 


30 The similar (but uglier) formula for solving cubics took until the 16th century and the work of del Ferro
and Cardano (and Cardano’s student Ferrari). A similar (but even uglier) formula for quartics was
also found by Ferrari. The extremely famous Abel-Ruffini Theorem (nearly proved by Ruffini in the late 18th
century and completely by Abel in the early 19th century) demonstrates that a similar formula for the
zeros of a quintic does not exist. Note that the theorem does not say that quintics do not have zeros;
rather it says that the zeros cannot in general be expressed using a finite combination of addition,
multiplication, division, powers and roots. The interested student should also look up Évariste Galois
and his contributions to this area.

31 In fact both of these are the same because we can write f(x) = g(x) − h(x) and then the zeros of f(x)
are exactly when g(x) = h(x).
Consider the leftmost of the above figures. It depicts a continuous function that is
negative at x = a and positive at x = b. So choose Y = 0 and apply the IVT: there must
be some c with a < c < b so that f(c) = Y = 0. While this doesn’t tell us c exactly, it does
give us bounds on the possible positions of at least one zero: there must be at least one
c obeying a < c < b.

See middle figure. To get better bounds we could test a point half-way between a and
b. So set a′ = (a + b)/2. In this example we see that f(a′) is negative. Applying the IVT again
tells us there is some c between a′ and b so that f(c) = 0. Again, we don’t have c exactly,
but we have halved the range of values it could take.

Look at the rightmost figure and do it again: test the point half-way between a′ and
b, namely b′ = (a′ + b)/2. In this example we see that f(b′) is positive. Applying the IVT tells us that there is
some c between a′ and b′ so that f(c) = 0. This new range is a quarter of the length of the
original. If we keep doing this process the range will halve each time until we know that
the zero is inside some tiny range of possible values. This process is called the bisection
method.
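The halving procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `bisect`, the tolerance `tol` and the test function x² − 2 are all choices made here, not part of the text.

```python
def bisect(f, a, b, tol=1e-8):
    """Shrink [a, b] around a zero of f until it is no longer than tol.

    Assumes f is continuous on [a, b] and that f(a), f(b) have opposite signs,
    so that the IVT guarantees a zero in between.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a, a
    if fb == 0:
        return b, b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid, mid
        # Keep whichever half-interval still has a sign change across it.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return a, b

# Illustrative use: x^2 - 2 is negative at x = 1 and positive at x = 2.
lo, hi = bisect(lambda x: x * x - 2, 1.0, 2.0)
print(lo, hi)
```

The function returns an interval rather than a single number, which mirrors what the IVT actually gives us: bounds on where a zero must lie.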

Consider the following zero-finding example 


Example 1.6.14 


Show that the function f(x) = x − 1 + sin(πx/2) has a zero in 0 ≤ x ≤ 1.

This question has been set up nicely to lead us towards using the IVT; we are already
given a nice interval on which to look. In general we might have to test a few points and
experiment a bit with a calculator before we can start narrowing down a range.

Let us start by testing the endpoints of the interval we are given

f(0) = 0 − 1 + sin(0) = −1 < 0
f(1) = 1 − 1 + sin(π/2) = 1 > 0

So we know a point where f is positive and one where it is negative. So by the IVT there
is a point in between where it is zero.

BUT in order to apply the IVT we have to show that the function is continuous, and 
we cannot simply write 


it is continuous 


We need to explain to the reader why it is continuous. That is — we have to prove it. 
So to write up our answer we can put something like the following — keeping in mind 
we need to tell the reader what we are doing so they can follow along easily. 


• We will use the IVT to prove that there is a zero in [0, 1].

• First we must show that the function is continuous.

  - Since x − 1 is a polynomial it is continuous everywhere.

  - The function sin(πx/2) is a trigonometric function and is also continuous everywhere.

  - The sum of two continuous functions is also continuous, so f(x) is continuous everywhere.

• Let a = 0, b = 1, then

f(0) = 0 − 1 + sin(0) = −1 < 0
f(1) = 1 − 1 + sin(π/2) = 1 > 0

• The function is negative at x = 0 and positive at x = 1. Since the function is continuous
we know there is a point c ∈ [0,1] so that f(c) = 0.


Notice that though we have not used full sentences in our explanation here, we are still 
using words. Your mathematics, unless it is very straight-forward computation, should 
contain words as well as symbols. 


Example 1.6.14


The zero is actually located at about x = 0.4053883559. 

The bisection method is really just the idea that we can keep repeating the above rea- 
soning (with a calculator handy). Each iteration will tell us the location of the zero more 
precisely. The following example illustrates this. 


Example 1.6.15 


Use the bisection method to find a zero of

f(x) = x − 1 + sin(πx/2)

that lies between 0 and 1.

So we start with the two points we worked out above:

• a = 0, b = 1 and f(0) < 0 and f(1) > 0.

• Test the point in the middle x = (0 + 1)/2 = 0.5

f(0.5) = 0.2071067813 > 0

• So our new interval will be [0, 0.5] since the function is negative at x = 0 and positive
at x = 0.5

Repeat

• a = 0, b = 0.5 where f(0) < 0 and f(0.5) > 0.

• Test the point in the middle x = (0 + 0.5)/2 = 0.25

f(0.25) = −0.3673165675 < 0

• So our new interval will be [0.25, 0.5] since the function is negative at x = 0.25 and
positive at x = 0.5

Repeat

• a = 0.25, b = 0.5 where f(0.25) < 0 and f(0.5) > 0.

• Test the point in the middle x = (0.25 + 0.5)/2 = 0.375

f(0.375) = −0.0694297669 < 0

• So our new interval will be [0.375, 0.5] since the function is negative at x = 0.375 and
positive at x = 0.5


Below is an illustration of what we have observed so far together with a plot of the actual
function.

[Figure: number lines showing the intervals found so far (ticks at 0.25, 0.375 and 0.5) and a plot of y = x − 1 + sin(πx/2)]

And one final iteration:

• a = 0.375, b = 0.5 where f(0.375) < 0 and f(0.5) > 0.

• Test the point in the middle x = (0.375 + 0.5)/2 = 0.4375

f(0.4375) = 0.0718932843 > 0


• So our new interval will be [0.375, 0.4375] since the function is negative at x = 0.375
and positive at x = 0.4375

So without much work we know the location of a zero inside a range of length 0.0625 =
2⁻⁴. Each iteration will halve the length of the range and we keep going until we reach
the precision we need, though it is much easier to program a computer to do it.
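As a rough sketch of what "programming a computer to do it" might look like, the Python below replays the four iterations above and then estimates how many halvings a chosen precision requires. The tolerance `tol` and the variable names are illustrative assumptions, not part of the example.

```python
import math

def f(x):
    return x - 1 + math.sin(math.pi * x / 2)

# Replay the four bisection steps, starting from f(0) < 0 < f(1).
a, b = 0.0, 1.0
intervals = []
for _ in range(4):
    mid = (a + b) / 2
    if f(mid) > 0:
        b = mid   # sign change is in [a, mid]
    else:
        a = mid   # sign change is in [mid, b]
    intervals.append((a, b))
    print(f"f({mid}) = {f(mid):+.6f}  ->  new interval [{a}, {b}]")

# After n halvings the interval has length (1 - 0) / 2**n, so reaching a
# length below tol needs n >= log2(1 / tol) steps.
tol = 1e-10   # assumed target precision
steps = math.ceil(math.log2(1.0 / tol))

lo, hi = 0.0, 1.0
for _ in range(steps):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        hi = mid
    else:
        lo = mid
print(steps, (lo + hi) / 2)
```

The final midpoint should agree with the value x ≈ 0.4053883559 quoted after Example 1.6.14 to within the chosen tolerance.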
Example 1.6.15



1.7 4 (optional) — Making the informal a little more formal 


As we noted above, the definition of limits that we have been working with was quite 
informal and not mathematically rigorous. In this (optional) section we will work to un- 
derstand the rigorous definition of limits. 

Here is the formal definition — we will work through it all very slowly and carefully 
afterwards, so do not panic. 


Let a ∈ ℝ and let f(x) be a function defined everywhere in a neighbourhood of
a, except possibly at a. We say that

the limit as x approaches a of f(x) is L

or equivalently

as x approaches a, f(x) approaches L

and write

\[ \lim_{x \to a} f(x) = L \]

if and only if for every ε > 0 there exists δ > 0 so that

|f(x) − L| < ε whenever 0 < |x − a| < δ

Note that an equivalent way of writing this very last statement is

if 0 < |x − a| < δ then |f(x) − L| < ε.


This is quite a lot to take in, so let us break it down into pieces. 


Usually a definition can be broken down into three pieces. 


e Scene setting — define symbols and any restrictions on the objects that we 
are talking about. 


e Naming — state the name and any notation for the property or object that 
the definition is about. 


e Properties and restrictions — this is the heart of the definition where we 
explain to the reader what it is that the object (in our case a function) has to 
do in order to satisfy the definition. 


Let us go back to the definition and look at each of these pieces in turn. 
• Setting things up: The first sentence of the definition is really just setting up the
picture. It is telling us what the definition is about and sorting out a few technical
details.

  - Let a ∈ ℝ: This simply tells us that the symbol “a” is a real number³².

  - Let f(x) be a function: This is just setting the scene so that we understand all
    of the terms and symbols.

  - defined everywhere in a neighbourhood of a, except possibly at a: This
    is just a technical requirement; we need our function to be defined in a little
    region³³ around a. The function doesn’t have to be defined everywhere, but it
    must be defined for all x-values a little less than a and a little more than a. The
    definition does not care about what the function does outside this little window,
    nor does it care what happens exactly at a.

• Names, phrases and notation: The next part of the definition is simply naming the
property we are discussing and tells us how to write it down, i.e. we are talking
about “limits” and we write them down using the symbols indicated.

• The heart of things: we explain this at length below, but for now we will give a
quick explanation. Work on these two points. They are hard.

  - for all ε > 0 there exists δ > 0: It is important we read this in order. It
    means that we can pick any positive number ε we want and there will always
    be another positive number δ that is going to make whatever follows be true.

  - if 0 < |x − a| < δ then |f(x) − L| < ε: From the previous point we have our
    two numbers: any ε > 0, and then, based on that choice of ε, a positive
    number δ. The current statement says that whenever we have chosen x so that
    it is very close to a, then f(x) has to be very close to L. How close is “very
    close”? Well 0 < |x − a| < δ means that x has to be within a distance δ of a (but
    not exactly a) and similarly |f(x) − L| < ε means that f(x) has to be within a
    distance ε of L.


That is the definition broken up into pieces which hopefully now make more sense, but
what does it actually mean? Consider a function we saw earlier

f(x) = 2x    when x ≠ 3

(the value of f at x = 3 itself plays no part in the limit) and sketch it again:

32 The symbol “∈” is read as “is an element of”. It is definitely not the same as e or ε or ϵ. If you do not
recognise “ℝ” or understand the difference between R and ℝ, then please go back and read Chapter 0
carefully.

33 The term “neighbourhood of a” means a small open interval around a, for example (a − 0.01, a + 0.01).
Typically we don’t really care how big this little interval is.
Figure 1.7.1.

We know (from our earlier work) that lim_{x→3} f(x) = 6, so zoom in around (x,y) =
(3,6). To make this look more like our definition, we have a = 3 and L = 6.

• Pick some small number ε > 0 and highlight the horizontal strip of all points (x, y)
for which |y − L| < ε. This means all the y-values have to satisfy L − ε < y < L + ε.

• You can see that the graph of the function passes through this strip for some x-values
close to a. What we need to be able to do is to pick a vertical strip of x-values around
a so that the function lies inside the horizontal strip.

• That is, we must find a small number δ > 0 so that for any x-value inside the vertical
strip a − δ < x < a + δ, except exactly at x = a, the value of the function lies inside the
horizontal strip, namely L − ε < y = f(x) < L + ε.

• We see (pictorially) that we can do this. If we were to choose a smaller value of ε
making the horizontal strip narrower, it is clear that we can choose the vertical strip
to be narrower. Indeed, it doesn’t matter how small we make the horizontal strip,
we will always be able to construct the second vertical strip.


The above is a pictorial argument, but we can quite easily make it into a mathematical
one. We want to show the limit is 6. That means for any ε we need to find a δ so that when

3 − δ < x < 3 + δ with x ≠ 3 we have 6 − ε < f(x) < 6 + ε

Now we note that when x ≠ 3, we have f(x) = 2x and so

6 − ε < f(x) < 6 + ε implies that 6 − ε < 2x < 6 + ε

this nearly specifies a range of x values, we just need to divide by 2

3 − ε/2 < x < 3 + ε/2

Hence if we choose δ = ε/2 then we get the desired inequality

3 − δ < x < 3 + δ

i.e. no matter what ε > 0 is chosen, if we put δ = ε/2 then when 3 − δ < x < 3 + δ
with x ≠ 3 we will have 6 − ε < f(x) < 6 + ε. This is exactly what we need to satisfy the
definition of “limit” above.

The above work gives us the argument we need, but it still needs to be written up
properly. We do this below.


Example 1.7.3 


Find the limit as x → 3 of the following function

f(x) = 2x    when x ≠ 3

Proof. We will show that the limit is equal to 6. Let ε > 0 and δ = ε/2. It remains to show
that |f(x) − 6| < ε whenever 0 < |x − 3| < δ.

So assume that 0 < |x − 3| < δ, and so

3 − δ < x < 3 + δ        multiply both sides by 2
6 − 2δ < 2x < 6 + 2δ

Recall that f(x) = 2x (because x ≠ 3) and that δ = ε/2, so

6 − ε < f(x) < 6 + ε.

We can conclude that |f(x) − 6| < ε as required.

Example 1.7.3

Because of the ε and δ in the definition of limits, we need to have ε and δ in the proof.
While ε and δ are just symbols playing particular roles, and could be replaced with other
symbols, this style of proof is usually called an ε-δ proof.
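A numerical spot-check can make the choice δ = ε/2 feel more concrete. The Python sketch below samples points with 0 < |x − 3| < δ and tests the inequality |f(x) − 6| < ε; the helper name `check`, the sample count and the list of ε values are illustrative assumptions. A finite sample is only evidence, not a substitute for the proof.

```python
def f(x):
    # f(x) = 2x away from x = 3; the value at x = 3 never enters the check
    return 2 * x

def check(eps, samples=1000):
    """Test |f(x) - 6| < eps at sample points with 0 < |x - 3| < delta."""
    delta = eps / 2                          # the choice made in the proof
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # 0 < offset < delta
        for x in (3 - offset, 3 + offset):
            if not abs(f(x) - 6) < eps:
                return False
    return True

results = {eps: check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)}
print(results)
```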


In the above example everything works, but it can be very instructive to see what 
happens in an example that doesn’t work. 


Example 1.7.4 


Look again at the function

\[ f(x) = \begin{cases} x & x < 2 \\ -1 & x = 2 \\ x + 3 & x > 2 \end{cases} \]

and let us see why, according to the definition of the limit, lim_{x→2} f(x) ≠ 2. Again, start
by sketching a picture and zooming in around (x,y) = (2,2):
Try to proceed through the same steps as before, now with a = 2 and L = 2:

• Pick some small number ε > 0 and highlight a horizontal strip that contains all y-values
with |y − L| < ε. This means all the y-values have to satisfy L − ε < y < L + ε.

• You can see that the graph of the function passes through this strip for some x-values
close to a. To the left of a, we can always find some x-values that make the function
sit inside the horizontal ε-strip. However, unlike the previous example, there is a
problem to the right of a. Even for x-values just a little larger than a, the value of
f(x) lies well outside the horizontal ε-strip.

• So given this choice of ε, we can find a δ > 0 so that for x inside the vertical strip
a − δ < x < a, the value of the function sits inside the horizontal ε-strip.

• Unfortunately, there is no way to choose a δ > 0 so that for x inside the vertical strip
a < x < a + δ (with x ≠ a) the value of the function sits inside the horizontal ε-strip.

• So it is impossible to choose δ so that for x inside the vertical strip a − δ < x < a + δ
the value of the function sits inside the horizontal strip L − ε < y = f(x) < L + ε.

• Thus the limit of f(x) as x → 2 is not 2.

Example 1.7.4
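The failure can also be seen numerically. In the Python sketch below, the choice ε = 1, the helper name `witness` and the list of δ values are illustrative assumptions; for each δ the code looks for a point just to the right of 2 that is within δ of 2 and yet has f(x) outside the ε-strip around 2.

```python
def f(x):
    if x < 2:
        return x
    if x == 2:
        return -1
    return x + 3

a, L, eps = 2, 2, 1.0   # eps = 1 is one concrete strip width that fails

def witness(delta):
    """Return an x with 0 < |x - a| < delta but |f(x) - L| >= eps, if one is found."""
    x = a + delta / 2   # a point just to the right of a
    if 0 < abs(x - a) < delta and abs(f(x) - L) >= eps:
        return x
    return None

found = {delta: witness(delta) for delta in (1.0, 0.1, 1e-3, 1e-6)}
print(found)
```

However small δ is made, the point a + δ/2 has f(x) = x + 3 > 5, which is more than 3 away from 2, so no δ can work for this ε.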


Doing things formally with ε’s and δ’s is quite painful for general functions. It is far
better to make use of the arithmetic of limits (Theorem 1.4.2) and some basic building
blocks (like those in Theorem 1.4.1). Thankfully most of the problems we deal with in
calculus (at this level at least) can be approached in exactly this way.

This does leave the problem of proving the arithmetic of limits and the limits of the
basic building blocks. The proof of Theorem 1.4.2 is quite involved and we leave it to
the very end of this Chapter. Before we do that we will prove Theorem 1.4.1 by a formal
ε-δ proof. Then in the next section we will look at the formal definition of limits at infinity
and prove Theorem 1.5.3. The proof of Theorem 1.5.9, the arithmetic of infinite limits,
is very similar to that of Theorem 1.4.2 and so we do not give it.

So let us now prove Theorem 1.4.1 in which we stated two simple limits:

\[ \lim_{x \to a} c = c \quad\text{and}\quad \lim_{x \to a} x = a. \]
Here is the formal ε-δ proof:

Proof of Theorem 1.4.1. Since there are two limits to prove, we do each in turn. Let a, c be
real numbers.

• Let ε > 0 and set f(x) = c. Choose δ = 1, then for any x satisfying |x − a| < δ (or
indeed any real number x at all) we have |f(x) − c| = 0 < ε. Hence lim_{x→a} c = c as
required.

• Let ε > 0 and set f(x) = x. Choose δ = ε, then for any x satisfying |x − a| < δ we
have

a − δ < x < a + δ        but f(x) = x and δ = ε so
a − ε < f(x) < a + ε

Thus we have |f(x) − a| < ε. Hence lim_{x→a} x = a as required.

This completes the proof.


1.8 4 (optional) — Making infinite limits a little more formal 


For those of you who made it through the formal ε-δ definition of limits we give the
formal definition of limits at infinity:

Let f be a function defined on the whole real line. We say that

the limit as x approaches ∞ of f(x) is L

or equivalently

f(x) converges to L as x goes to ∞

and write

\[ \lim_{x \to \infty} f(x) = L \]

if and only if for every ε > 0 there exists M ∈ ℝ so that |f(x) − L| < ε whenever
x > M.

Similarly we write

\[ \lim_{x \to -\infty} f(x) = K \]

if and only if for every ε > 0 there exists N ∈ ℝ so that |f(x) − K| < ε whenever
x < N.
Note that we can loosen the above requirement that f be defined on the whole real line.
All we actually require is that it is defined for all x larger than some value. It would be
sufficient to require “there is some x₀ ∈ ℝ so that f is defined for all x > x₀”.

For completeness let’s prove Theorem 1.5.3 using this formal definition. The layout of
the proof will be very similar to our proof of Theorem 1.4.1.


Proof of Theorem 1.5.3. There are four limits to prove in total and we do each in turn. Let
c ∈ ℝ.

• Let ε > 0 and set f(x) = c. Choose M = 0, then for any x satisfying x > M (or
indeed any real number x at all) we have |f(x) − c| = 0 < ε. Hence lim_{x→∞} c = c as
required.

• The proof that lim_{x→−∞} c = c is nearly identical. Again, let ε > 0 and set f(x) = c.
Choose N = 0, then for any x satisfying x < N we have |f(x) − c| = 0 < ε. Hence
lim_{x→−∞} c = c as required.

• Let ε > 0 and set f(x) = 1/x. Choose M = 1/ε. Then when x > M we have

0 < M < x        divide through by xM to get
0 < 1/x < 1/M = ε

Since x > 0, 1/x = |1/x| = |1/x − 0| < ε as required.

• Again, the proof in the limit to −∞ is similar but we have to be careful of signs. Let
ε > 0 and set f(x) = 1/x. Choose N = −1/ε. Then when x < N we have

0 > N > x        divide through by xN to get
0 > 1/x > 1/N = −ε

Notice that by assumption both x, N < 0, so xN > 0. Now since x < 0, 1/x =
−|1/x|, and so |1/x| = |1/x − 0| < ε as required.


This completes the proof. 
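The choices M = 1/ε and N = −1/ε for the limits of 1/x can be spot-checked numerically. In the Python sketch below the helper names, sample points and list of ε values are illustrative assumptions; passing these checks is consistent with the proof but does not replace it.

```python
def check_plus_infinity(eps, samples=1000):
    """Test |1/x - 0| < eps at sample points x > M, with M = 1/eps."""
    M = 1 / eps
    xs = [M * (1 + k / 10) for k in range(1, samples + 1)]   # all exceed M
    return all(abs(1 / x - 0) < eps for x in xs)

def check_minus_infinity(eps, samples=1000):
    """Test |1/x - 0| < eps at sample points x < N, with N = -1/eps."""
    N = -1 / eps
    xs = [N * (1 + k / 10) for k in range(1, samples + 1)]   # all below N
    return all(abs(1 / x - 0) < eps for x in xs)

results = {
    eps: (check_plus_infinity(eps), check_minus_infinity(eps))
    for eps in (1.0, 0.1, 1e-3, 1e-6)
}
print(results)
```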


1.9 4 (optional) — Proving the arithmetic of limits 


Perhaps the most useful theorem of this chapter is Theorem 1.4.2 which shows how lim- 
its interact with arithmetic. Before we get to the proofs it is very helpful to prove three 
technical lemmas that we'll need. The first is a very general result about absolute values 
of numbers: 


Lemma 1.9.1 (The triangle inequality). 


For any x, y ∈ ℝ

|x + y| ≤ |x| + |y|

Proof. Notice that for any real number x, we always have |x|² = x². So now let x, y ∈ ℝ.
Then

|x + y|² = (x + y)²                expand it
= x² + 2xy + y²                    use x² = |x|²
= |x|² + 2xy + |y|²                always have |x|·|y| ≥ x·y
≤ |x|² + 2|x||y| + |y|²            factor it
= (|x| + |y|)²

Thus we have shown that |x + y|² ≤ (|x| + |y|)². Now take square-roots of both sides to
recover the triangle inequality.

Strictly speaking, the last step in the proof relies on the fact that the square-root is an
increasing function, i.e. if 0 ≤ a ≤ b then 0 ≤ √a ≤ √b. You can either prove this, or you
can avoid it entirely by doing some case analysis.

The second lemma is more specialised. It proves that if we have a function f(x) → F
as x → a then there must be a small window around x = a where the function f(x) must
only take values close to F. In particular it tells us that |f(x)| cannot be bigger than 2|F|
when x is very close to a.

Lemma 1.9.2.

Let a ∈ ℝ and let f be a function so that lim_{x→a} f(x) = F. Then there exists a δ > 0
so that if |x − a| < δ then we also have |f(x)| < 2|F|.

The proof is mostly just manipulating the ε-δ definition of a limit with ε = |F|.

Proof. Let ε = |F|. Then since f(x) → F as x → a, there exists δ > 0 so that when
|x − a| < δ, we also have |f(x) − F| < ε = |F|. So now assume |x − a| < δ. Then

−ε < f(x) − F < ε        rearrange a little
−ε + F < f(x) < ε + F

Now ε + F ≤ ε + |F| and −ε − |F| ≤ −ε + F, so

−ε − |F| < f(x) < ε + |F|

Hence we have |f(x)| < ε + |F| = 2|F|.


Finally our third technical lemma gives us a bound in the other direction; it tells us
that when x is close to a, the value of |f(x)| cannot be much smaller than |F|.

Lemma 1.9.3.

Let a ∈ ℝ and let f be a function so that lim_{x→a} f(x) = F. Then there exists δ > 0 so
that when |x − a| < δ, we have |f(x)| > |F|/2.

Proof. Set ε = |F|/2 > 0. Since f(x) → F, we know there exists a δ > 0 so that when
|x − a| < δ we have |f(x) − F| < ε. So now assume |x − a| < δ and |f(x) − F| < ε = |F|/2.
Then

|F| = |F − f(x) + f(x)|            sneaky trick
≤ |f(x) − F| + |f(x)|              but |f(x) − F| < ε
< ε + |f(x)|

Hence |f(x)| > |F| − ε = |F|/2 as required.


Now we are in a position to prove Theorem 1.4.2. The proof has more steps than
the previous ε-δ proofs we have seen. This is mostly because we do not have specific
functions f(x) and g(x) and instead must play with them in the abstract, making good
use of the formal definition of limits.

We will break the proof into three pieces. The minimum that is required is to prove
that

\[ \lim_{x \to a} \big( f(x) + g(x) \big) = F + G \]
\[ \lim_{x \to a} f(x) \cdot g(x) = F \cdot G \]
\[ \lim_{x \to a} \frac{1}{g(x)} = \frac{1}{G}. \]

From these three we can prove that

\[ \lim_{x \to a} f(x) \cdot c = F \cdot c \]
\[ \lim_{x \to a} \big( f(x) - g(x) \big) = F - G \]
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{F}{G}. \]

The first follows by setting g(x) = c and using lim f(x)·g(x). The second follows by
setting c = −1, putting h(x) = (−1)·g(x) and then applying both lim f(x)·g(x) and
lim f(x) + g(x). The third follows by setting h(x) = 1/g(x) and then computing lim f(x)·h(x).

Starting with addition, in order to satisfy the definition of limit, we are going to have
to show that

|(f(x) + g(x)) − (F + G)| is small

when we know that |f(x) − F|, |g(x) − G| are small. To do this we use the triangle inequality
above showing that

|(f(x) + g(x)) − (F + G)| = |(f(x) − F) + (g(x) − G)| ≤ |f(x) − F| + |g(x) − G|

This is the key technical piece of the proof. So if we want the LHS of the above to be size
ε, we need to make sure that each term on the RHS is of size ε/2. The rest of the proof is
setting up facts based on the definition of limits and then rearranging facts to reach the
conclusion.
Proof of Theorem 1.4.2 (limit of a sum). Let a ∈ ℝ and assume that

\[ \lim_{x \to a} f(x) = F \quad\text{and}\quad \lim_{x \to a} g(x) = G. \]

We wish to show that

\[ \lim_{x \to a} \big( f(x) + g(x) \big) = F + G. \]

Let ε > 0. We have to show that there is a δ > 0 so that when |x − a| < δ then |(f(x) +
g(x)) − (F + G)| < ε.

Set ε₁ = ε₂ = ε/2. By the definition of limits, because f(x) → F there
exists some δ₁ > 0 so that whenever |x − a| < δ₁, we also have |f(x) − F| < ε₁. Similarly
there exists δ₂ > 0 so that if |x − a| < δ₂, then we must have |g(x) − G| < ε₂. So now
choose δ = min{δ₁, δ₂} and assume |x − a| < δ. Then we must have that |x − a| < δ₁, δ₂
and so we also have

|f(x) − F| < ε₁ and |g(x) − G| < ε₂

Now consider |(f(x) + g(x)) − (F + G)| and rearrange the terms:

|(f(x) + g(x)) − (F + G)| = |(f(x) − F) + (g(x) − G)|        now apply triangle inequality
≤ |f(x) − F| + |g(x) − G|                                    use facts from above
< ε₁ + ε₂
= ε.

Hence we have shown that for any ε > 0 there exists some δ > 0 so that when |x − a| < δ
we also have |(f(x) + g(x)) − (F + G)| < ε, which is exactly the formal definition of the
limit we needed to prove.


Let us do similarly for the limit of a product. Some of the details of the proof are very
similar, but there is a little technical trick in the middle to make it work. In particular we
need to show that

|f(x)·g(x) − F·G| is small

when we know that |f(x) − F| and |g(x) − G| are both small. Notice that

f(x)·g(x) − F·G = f(x)·g(x) − F·G + f(x)·G − f(x)·G        (the last two terms sum to 0)
= f(x)·g(x) − f(x)·G + f(x)·G − F·G
= f(x)·(g(x) − G) + (f(x) − F)·G

So if we know |f(x) − F| is small and |g(x) − G| is small then we are done, except that
we also need to know that f(x) doesn’t become really large at exactly this point. This is
exactly why we needed to prove Lemma 1.9.2.

As was the case in the previous proof, we want the LHS to be of size ε, so we want the
two terms on the RHS to be size ε/2. This means

• we need |f(x) − F| to be size ε/(2|G|), and

• we need |g(x) − G| to be size ε/(4|F|) since we know that |f(x)| < 2|F| when x is close
to a.

Armed with these tricks we turn to the proof.
Proof of Theorem 1.4.2 (limit of a product). Let a ∈ ℝ and assume that

\[ \lim_{x \to a} f(x) = F \quad\text{and}\quad \lim_{x \to a} g(x) = G. \]

We wish to show that

\[ \lim_{x \to a} f(x) \cdot g(x) = F \cdot G. \]

Let ε > 0. Set ε₁ = ε/(2|G|) and ε₂ = ε/(4|F|). From this we establish the existence of δ₁, δ₂, δ₃
which we need below.

• By assumption f(x) → F so there exists δ₁ > 0 so that whenever |x − a| < δ₁ we also
have |f(x) − F| < ε₁.

• Similarly because g(x) → G, there exists δ₂ > 0 so that whenever |x − a| < δ₂, we
also have |g(x) − G| < ε₂.

• By Lemma 1.9.2 there exists δ₃ > 0 so that whenever |x − a| < δ₃, we also have
|f(x)| < 2|F|.

Let δ = min{δ₁, δ₂, δ₃}, assume |x − a| < δ and consider |f(x)·g(x) − F·G|. Rearrange
the terms as we did above:

|f(x)·g(x) − F·G| = |f(x)·(g(x) − G) + (f(x) − F)·G|
≤ |f(x)|·|g(x) − G| + |G|·|f(x) − F|

By our three dot-points above we know that |f(x) − F| < ε₁ and |g(x) − G| < ε₂ and
|f(x)| < 2|F|, so we have

|f(x)·g(x) − F·G| < |f(x)|·ε₂ + |G|·ε₁        sub in ε₁, ε₂ and bound on f(x)
< 2|F|·ε/(4|F|) + |G|·ε/(2|G|)
= ε/2 + ε/2
= ε.

Thus we have shown that for any ε > 0 there exists δ > 0 so that when |x − a| < δ we
also have |f(x)·g(x) − F·G| < ε. Hence f(x)·g(x) → F·G.
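To see the three δ's working together, here is a small Python sketch on one concrete pair of functions. Everything specific in it is an assumption chosen for illustration: f(x) = 2x and g(x) = x + 1 near a = 3 (so F = 6, G = 4), the hand-picked δ₁, δ₂, δ₃ that suit these two functions, and the helper names.

```python
def f(x):
    return 2 * x      # illustrative choice; f(x) -> F = 6 as x -> 3

def g(x):
    return x + 1      # illustrative choice; g(x) -> G = 4 as x -> 3

a, F, G = 3.0, 6.0, 4.0

def delta_for(eps):
    eps1 = eps / (2 * abs(G))   # target size for |f(x) - F|
    eps2 = eps / (4 * abs(F))   # target size for |g(x) - G|
    delta1 = eps1 / 2           # |2x - 6| < eps1 whenever |x - 3| < eps1 / 2
    delta2 = eps2               # |(x + 1) - 4| < eps2 whenever |x - 3| < eps2
    delta3 = 3.0                # |x - 3| < 3 gives 0 < x < 6, so |f(x)| < 12 = 2|F|
    return min(delta1, delta2, delta3)

def check(eps, samples=1000):
    """Test |f(x) g(x) - F G| < eps at sample points within delta of a."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # 0 < offset < delta
        for x in (a - offset, a + offset):
            if not abs(f(x) * g(x) - F * G) < eps:
                return False
    return True

results = {eps: check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)}
print(results)
```

Taking the minimum of the three δ's is what guarantees that all three bounds from the dot-points hold at once.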


Finally we can prove the limit of a reciprocal. Notice that

\[ \frac{1}{g(x)} - \frac{1}{G} = \frac{G - g(x)}{g(x) \cdot G} \]

We need to show the LHS is of size ε, so if G − g(x) is small we are done, except if g(x)
or G are zero. By assumption (go back and read Theorem 1.4.2) we have G ≠ 0, and we
know from Lemma 1.9.3 that |g(x)| cannot be smaller than |G|/2. Together these imply that
the denominator on the RHS cannot be zero and must be at least |G|²/2 in absolute value. Thus we need
|G − g(x)| to be of size ε·|G|²/2.
Proof of Theorem 1.4.2 (limit of a reciprocal). Let ε > 0 and set ε₁ = ε|G|²·½. We
now use this and Lemma 1.9.3 to establish the existence of δ₁, δ₂.

• Since g(x) → G we know that there exists δ₁ > 0 so that when |x − a| < δ₁ we also
have |g(x) − G| < ε₁.

• By Lemma 1.9.3 there exists δ₂ so that when |x − a| < δ₂ we also have |g(x)| > |G|/2.
Equivalently, when |x − a| < δ₂ we also have |G/(2g(x))| < 1.

Set δ = min{δ₁, δ₂} and assume |x − a| < δ. Then

|1/g(x) − 1/G| = |(G − g(x)) / (g(x)·G)|
= |g(x) − G| · 1/(|G|·|g(x)|)        by assumption
< ε₁ / (|G|·|g(x)|)                   sub in ε₁
= ε · |G/(2g(x))|                     since |G/(2g(x))| < 1
< ε

Thus we have shown that for any ε > 0 there exists δ > 0 so that when |x − a| < δ we also
have |1/g(x) − 1/G| < ε. Hence 1/g(x) → 1/G.
Calculus is built on two operations: differentiation, which is used to analyse instantaneous
rate of change, and integration, which is used to analyse areas. Understanding
differentiation and using it to compute derivatives of functions is one of the main aims of
this course.

We had a glimpse of derivatives in the previous chapter on limits — in particular Sec- 
tions 1.1 and 1.2 on tangents and velocities introduced derivatives in disguise. One of the 
main reasons that we teach limits is to understand derivatives. Fortunately, as we shall 
see, while one does need to understand limits in order to correctly understand deriva- 
tives, one does not need the full machinery of limits in order to compute and work with 
derivatives. The other main part of calculus, integration, we (mostly) leave until a later 
course. 

The derivative finds many applications in many different areas of the sciences. In- 
deed the reason that calculus is taken by so many university students is so that they may 
then use the ideas both in subsequent mathematics courses and in other fields. In almost 
any field in which you study quantitative data you can find calculus lurking somewhere 
nearby. 

Its development¹ came about over a very long time, starting with the ancient Greek geometers.
Indian, Persian and Arab mathematicians made significant contributions from
around the 6th century. But modern calculus really starts with Newton and Leibniz in the
17th century, who developed it independently, building on the ideas of others including Descartes.
Newton applied his work to many physical problems (including orbits of moons and
planets) but didn’t publish his work. When Leibniz subsequently published his “calculus”,
Newton accused him of plagiarism; this caused a huge rift between British and
continental-European mathematicians which wasn’t closed for another century.


2.1 Revisiting tangent lines


By way of motivation for the definition of the derivative, we return to the discussion of
tangent lines that we started in the previous chapter on limits. We consider, in
Examples 2.1.2 and 2.1.3, below, the problem of finding the slope of the tangent line to a curve
at a point. But let us start by recalling, in Example 2.1.1, what is meant by the slope of a
straight line.

1 A quick google will turn up many articles on the development and history of calculus. Wikipedia has
a good one.


Example 2.1.1

In this example, we recall what is meant by the slope of the straight line

\[ y = \tfrac{1}{2}x + \tfrac{3}{2} \]

• We claim that if, as we walk along this straight line, our $x$-coordinate changes by an
  amount $\Delta x$, then our $y$-coordinate changes by exactly $\Delta y = \tfrac{1}{2}\Delta x$.

• For example, in the figure on the left below, we move from the point
  \[ (x_0, y_0) = \left(1,\ 2 = \tfrac{1}{2}\times 1 + \tfrac{3}{2}\right) \]
  on the line to the point
  \[ (x_1, y_1) = \left(5,\ 4 = \tfrac{1}{2}\times 5 + \tfrac{3}{2}\right) \]
  on the line. In this move our $x$-coordinate changes by
  \[ \Delta x = 5 - 1 = 4 \]
  and our $y$-coordinate changes by
  \[ \Delta y = 4 - 2 = 2 \]
  which is indeed $\tfrac{1}{2}\times 4 = \tfrac{1}{2}\Delta x$, as claimed.

• In general, when we move from the point
  \[ (x_0, y_0) = \left(x_0,\ \tfrac{1}{2}x_0 + \tfrac{3}{2}\right) \]
  on the line to the point
  \[ (x_1, y_1) = \left(x_1,\ \tfrac{1}{2}x_1 + \tfrac{3}{2}\right) \]
  on the line, our $x$-coordinate changes by
  \[ \Delta x = x_1 - x_0 \]
  and our $y$-coordinate changes by
  \[
  \begin{aligned}
  \Delta y &= y_1 - y_0 \\
           &= \left[\tfrac{1}{2}x_1 + \tfrac{3}{2}\right] - \left[\tfrac{1}{2}x_0 + \tfrac{3}{2}\right] \\
           &= \tfrac{1}{2}(x_1 - x_0)
  \end{aligned}
  \]
  which is indeed $\tfrac{1}{2}\Delta x$, as claimed.

• So, for the straight line $y = \tfrac{1}{2}x + \tfrac{3}{2}$, the ratio
  $\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0}$ always takes the value $\tfrac{1}{2}$,
  regardless of the choice of initial point $(x_0, y_0)$ and final point $(x_1, y_1)$. This constant
  ratio is the slope of the line $y = \tfrac{1}{2}x + \tfrac{3}{2}$.

Straight lines are special in that for each straight line, there is a fixed number $m$, called
the slope of the straight line, with the property that if you take any two different points,
$(x_0, y_0)$ and $(x_1, y_1)$, on the line, the ratio
$\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0}$, which is called the rate of change
of $y$ per unit rate of change² of $x$, always takes the value $m$. This is the property that
distinguishes lines from other curves.

Other curves do not have this property. In the next two examples we illustrate this
point with the parabola $y = x^2$. Recall that we studied this example back in Section 1.1.
In Example 2.1.2 we find the slope of the tangent line to $y = x^2$ at a particular point. We
generalise this in Example 2.1.3, to show that we can define “the slope of the curve $y = x^2$”
at an arbitrary point $x = x_0$ by considering
$\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0}$ with $(x_1, y_1)$ very close to $(x_0, y_0)$.


Example 2.1.2

In this example, let us fix $(x_0, y_0)$ to be the point $(2, 4)$ on the parabola $y = x^2$. Now let
$(x_1, y_1) = (x_1, x_1^2)$ be some other point on the parabola; that is, a point with $x_1 \ne x_0$.

• Draw the straight line through $(x_0, y_0)$ and $(x_1, y_1)$. This is a secant line, and we
  saw these in Chapter 1 when we discussed tangent lines³.

• The following table gives the slope, $\frac{y_1 - y_0}{x_1 - x_0}$, of the secant line through
  $(x_0, y_0) = (2, 4)$ and $(x_1, y_1)$, for various different choices of $(x_1, y_1 = x_1^2)$.

  $x_1$                                        | 1 | 1.5  | 1.9  | 1.99   | 1.999  | ∘ | 2.001  | 2.01   | 2.1  | 2.5
  $y_1 = x_1^2$                                | 1 | 2.25 | 3.61 | 3.9601 | 3.9960 | ∘ | 4.0040 | 4.0401 | 4.41 | 6.25
  $\frac{y_1-y_0}{x_1-x_0} = \frac{y_1-4}{x_1-2}$ | 3 | 3.5  | 3.9  | 3.99   | 3.999  | ∘ | 4.001  | 4.01   | 4.1  | 4.5

2 In the “real world” the phrase “rate of change” usually refers to rate of change per unit time. In science
it is used more generally.
3 If you do not remember this, then please revisit the first couple of sections of Chapter 1.
• So now we have a big table of numbers. What do we do with them? Well, there are
  messages we can take away from this table.

  - Different choices of $x_1$ give different values for the slope, $\frac{y_1 - y_0}{x_1 - x_0}$,
    of the secant through $(x_0, y_0)$ and $(x_1, y_1)$. This is illustrated in Figure 2.1.1 below:
    the slope of the secant through $(x_0, y_0)$ and $(x_1, y_1)$ is different from the slope of the
    secant through $(x_0, y_0)$ and $(x_1', y_1')$.

Figure 2.1.1. For a curvy curve, different secants have different slopes.


If the parabola were a straight line this would not be the case. The secant
through any two different points on a line is always identical to the line itself
and so always has exactly the same slope as the line itself, as is illustrated in
Figure 2.1.2 below: the (yellow) secant through $(x_0, y_0)$ and $(x_1, y_1)$ lies exactly
on top of the (red) line $y = \tfrac{1}{2}x + \tfrac{3}{2}$.

Figure 2.1.2. For a straight line, all secants have the same slope.
  - Now look at the columns of the table closer to the middle. As $x_1$ gets closer
    and closer to $x_0 = 2$, the slope, $\frac{y_1 - y_0}{x_1 - x_0}$, of the secant through
    $(x_0, y_0)$ and $(x_1, y_1)$ appears to get closer and closer to the value 4.


Example 2.1.3


It is very easy to generalise what is happening in Example 2.1.2.

• Fix any point $(x_0, y_0)$ on the parabola $y = x^2$. If $(x_1, y_1)$ is any other point on the
  parabola $y = x^2$, then $y_1 = x_1^2$ and the slope of the secant through $(x_0, y_0)$ and
  $(x_1, y_1)$ is

  \[
  \begin{aligned}
  \text{slope} = \frac{y_1 - y_0}{x_1 - x_0}
    &= \frac{x_1^2 - x_0^2}{x_1 - x_0} && \text{since } y = x^2 \\
    &= \frac{(x_1 - x_0)(x_1 + x_0)}{x_1 - x_0} && \text{remember } a^2 - b^2 = (a-b)(a+b) \\
    &= x_1 + x_0
  \end{aligned}
  \]

  You should check the values given in the table of Example 2.1.2 above to convince
  yourself that the slope $\frac{y_1 - y_0}{x_1 - x_0}$ of the secant line really is
  $x_0 + x_1 = 2 + x_1$ (since we set $x_0 = 2$).

• Now as we move $x_1$ closer and closer to $x_0$, the slope should move closer and closer
  to $2x_0$. Indeed if we compute the limit carefully (we now have the technology to
  do this) we see that in the limit as $x_1 \to x_0$ the slope becomes $2x_0$. That is

  \[
  \begin{aligned}
  \lim_{x_1 \to x_0} \frac{y_1 - y_0}{x_1 - x_0}
    &= \lim_{x_1 \to x_0} (x_1 + x_0) && \text{by the work we did just above} \\
    &= 2x_0
  \end{aligned}
  \]

  Taking this limit gives us our first derivative. Of course we haven’t yet given the
  definition of a derivative, so we perhaps wouldn’t recognise it yet. We rectify this in
  the next section.
Figure 2.1.3. Secants approaching a tangent line.

• So it is reasonable to say “as $x_1$ approaches $x_0$, the secant through $(x_0, y_0)$ and $(x_1, y_1)$
  approaches the tangent line to the parabola $y = x^2$ at $(x_0, y_0)$”. This is what we did
  back in Section 1.1.

  The figure above shows four different secants through $(x_0, y_0)$ for the curve $y = x^2$.
  The four hollow circles are four different choices of $(x_1, y_1)$. As $(x_1, y_1)$ approaches
  $(x_0, y_0)$, the corresponding secant does indeed approach the tangent to $y = x^2$ at
  $(x_0, y_0)$, which is the heavy (red) straight line in the figure.

  Using limits we determined the slope of the tangent line to $y = x^2$ at $x_0$ to be $2x_0$.
  Often we will be a little sloppy with our language and instead say “the slope of the
  parabola $y = x^2$ at $(x_0, y_0)$ is $2x_0$”, where we really mean the slope of the line
  tangent to the parabola at $x_0$.
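The table of Example 2.1.2 and the formula $x_1 + x_0$ of Example 2.1.3 can be checked numerically. The following is a minimal Python sketch, not part of the text's own development; the names `secant_slope` and `parabola` are our own choices for illustration.

```python
def secant_slope(f, x0, x1):
    """Slope of the secant through (x0, f(x0)) and (x1, f(x1))."""
    if x1 == x0:
        raise ValueError("the two points must be different")
    return (f(x1) - f(x0)) / (x1 - x0)


def parabola(x):
    """The curve y = x^2 used in Examples 2.1.2 and 2.1.3."""
    return x ** 2


x0 = 2.0
x1_values = [1.0, 1.5, 1.9, 1.99, 1.999, 2.001, 2.01, 2.1, 2.5]
slopes = [secant_slope(parabola, x0, x1) for x1 in x1_values]

for x1, slope in zip(x1_values, slopes):
    # Each slope should agree with x1 + x0, up to floating-point rounding.
    print(f"x1 = {x1:<6} secant slope = {slope:.4f}")
```

As the chosen $x_1$ values close in on $x_0 = 2$ from either side, the printed slopes close in on $2x_0 = 4$, in line with the limit computed above.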


2.2 Definition of the derivative


We now define the “derivative” explicitly, based on the limiting slope ideas of the previous 
section. Then we see how to compute some simple derivatives. 

Let us now generalise what we did in the last section so as to find “the slope of the
curve $y = f(x)$ at $(x_0, y_0)$” for any smooth enough⁴ function $f(x)$.

As before, let $(x_0, y_0)$ be any point on the curve $y = f(x)$. So we must have $y_0 = f(x_0)$.
Now let $(x_1, y_1)$ be any other point on the same curve. So $y_1 = f(x_1)$ and $x_1 \ne x_0$. Think
of $(x_1, y_1)$ as being pretty close to $(x_0, y_0)$ so that the difference

\[ \Delta x = x_1 - x_0 \]

in $x$-coordinates is pretty small. In terms of this $\Delta x$ we have

\[ x_1 = x_0 + \Delta x \qquad\text{and}\qquad y_1 = f(x_0 + \Delta x) \]

We can construct a secant line through $(x_0, y_0)$ and $(x_1, y_1)$ just as we did for the parabola
above. It has slope

\[ \frac{y_1 - y_0}{x_1 - x_0} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]

If $f(x)$ is reasonably smooth⁵, then as $x_1$ approaches $x_0$, i.e. as $\Delta x$ approaches 0, we would
expect the secant through $(x_0, y_0)$ and $(x_1, y_1)$ to approach the tangent line to the curve
$y = f(x)$ at $(x_0, y_0)$, just as happened in Figure 2.1.3. And more importantly, the slope of
the secant through $(x_0, y_0)$ and $(x_1, y_1)$ should approach the slope of the tangent line to
the curve $y = f(x)$ at $(x_0, y_0)$.

Thus we would expect⁶ the slope of the tangent line to the curve $y = f(x)$ at $(x_0, y_0)$
to be

\[ \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]

When we talk of the “slope of the curve” at a point, what we really mean is the slope of
the tangent line to the curve at that point. So “the slope of the curve $y = f(x)$ at $(x_0, y_0)$”
is also the limit⁷ expressed in the above equation. The derivative of $f(x)$ at $x = x_0$ is also
defined to be this limit. This leads⁸ us to the most important definition in this text:

4 The idea of “smooth enough” can be made quite precise. Indeed the word “smooth” has a very precise
meaning in mathematics, which we won’t cover here. For now think of “smooth” as meaning roughly
just “smooth”.
5 Again the term “reasonably smooth” can be made more precise.
6 Indeed, we don’t have to expect: it is!
7 This is of course under the assumption that the limit exists. We will talk more about that below.
8 We will rename “$x_0$” to “$a$” and “$\Delta x$” to “$h$”.
Definition 2.2.1.

Let $a \in \mathbb{R}$ and let $f(x)$ be defined on an open interval⁹ that contains $a$.

• The derivative of $f(x)$ at $x = a$ is denoted $f'(a)$ and is defined by

  \[ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \]

  if the limit exists.

• When the above limit exists, the function $f(x)$ is said to be differentiable
  at $x = a$. When the limit does not exist, the function $f(x)$ is said to be not
  differentiable at $x = a$.

• We can equivalently define the derivative $f'(a)$ by the limit

  \[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \]

  To see that these two definitions are the same, we set $x = a + h$ and then
  the limit as $h$ goes to $0$ is equivalent to the limit as $x$ goes to $a$.
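To get a feel for the two equivalent limits in this definition, one can evaluate both difference quotients for smaller and smaller $h$. The sketch below is only a numerical illustration, not a proof; the names `quotient_h`, `quotient_x` and `square` are hypothetical helpers introduced here.

```python
def quotient_h(f, a, h):
    """The quotient (f(a+h) - f(a)) / h from the first form of the definition."""
    return (f(a + h) - f(a)) / h


def quotient_x(f, a, x):
    """The quotient (f(x) - f(a)) / (x - a) from the second form."""
    return (f(x) - f(a)) / (x - a)


def square(x):
    return x * x


a = 3.0
for h in [0.1, 0.01, 0.001, 0.0001]:
    x = a + h  # the substitution x = a + h links the two forms
    print(f"h = {h:<7} h-form = {quotient_h(square, a, h):.6f} "
          f"x-form = {quotient_x(square, a, x):.6f}")
```

Both columns agree up to rounding and settle towards 6 as $h$ shrinks, which is the value one would expect for the slope of $y = x^2$ at $x = 3$.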


Let’s now compute the derivatives of some very simple functions. This is our first step
towards building up a toolbox for computing derivatives of complicated functions; this
process will very much parallel what we did in Chapter 1 with limits. The two simplest
functions we know are $f(x) = c$ and $g(x) = x$.


Example 2.2.2 (Derivative of $f(x) = c$)

Let $a, c \in \mathbb{R}$ be constants. Compute the derivative of the constant function $f(x) = c$ at
$x = a$.

We compute the desired derivative by just substituting the function of interest into the
formal definition of the derivative.

\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} && \text{(the definition)} \\
      &= \lim_{h \to 0} \frac{c - c}{h} && \text{(substituted in the function)} \\
      &= \lim_{h \to 0} 0 && \text{(simplified things)} \\
      &= 0
\end{aligned}
\]

9 Recall, from Definition 0.3.5, that the open interval $(c, d)$ is just the set of all real numbers obeying
$c < x < d$.
That was easy! What about the next most complicated function? Arguably it’s this one:

Example 2.2.3 (Derivative of $g(x) = x$)

Let $a \in \mathbb{R}$ and compute the derivative of $g(x) = x$ at $x = a$.
Again, we compute the derivative of $g$ by just substituting the function of interest into
the formal definition of the derivative and then evaluating the resulting limit.

\[
\begin{aligned}
g'(a) &= \lim_{h \to 0} \frac{g(a+h) - g(a)}{h} && \text{(the definition)} \\
      &= \lim_{h \to 0} \frac{(a+h) - a}{h} && \text{(substituted in the function)} \\
      &= \lim_{h \to 0} \frac{h}{h} && \text{(simplified things)} \\
      &= \lim_{h \to 0} 1 && \text{(simplified a bit more)} \\
      &= 1
\end{aligned}
\]


That was a little harder than the first example, but still quite straightforward: start
with the definition and apply what we know about limits.
Thanks to these two examples, we have our first theorem about derivatives:

Theorem 2.2.4 (Easiest derivatives).

Let $a, c \in \mathbb{R}$ and let $f(x) = c$ be the constant function and $g(x) = x$. Then

\[ f'(a) = 0 \qquad\text{and}\qquad g'(a) = 1. \]

To ratchet up the difficulty a little bit more, let us redo the example we have already
done a few times, $f(x) = x^2$. To make it a little more interesting let’s change the names of
the function and the variable so that it is not exactly the same as Examples 2.1.2 and 2.1.3.


Example 2.2.5 (Derivative of $h(t) = t^2$)

Compute the derivative of

\[ h(t) = t^2 \qquad\text{at } t = a \]

• This function isn’t quite like the ones we saw earlier: it’s a function of $t$ rather
  than $x$. Recall that a function is a rule which assigns to each input value an output
  value. So far, we have usually called the input value $x$. But this “$x$” is just a dummy
  variable representing a generic input value. There is nothing wrong with calling a
  generic input value $t$ instead. Indeed, from time to time you will see functions that
  are not written as formulas involving $x$, but instead are written as formulas in $t$ (for
  example representing time, see Section 1.2), or $z$ (for example representing height),
  or other symbols.

• So let us write the definition of the derivative

  \[ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \]

  and then translate it to the function names and variables at hand:

  \[ h'(a) = \lim_{h \to 0} \frac{h(a+h) - h(a)}{h} \]

• But there is a problem: “$h$” plays two roles here. It is both the function name
  and the small quantity that is going to zero in our limit. It is extremely dangerous
  to have a symbol represent two different things in a single computation. We need to
  change one of them. So let’s rename the small quantity that is going to zero in our
  limit from “$h$” to “$\Delta t$”:

  \[ h'(a) = \lim_{\Delta t \to 0} \frac{h(a + \Delta t) - h(a)}{\Delta t} \]

• Now we are ready to begin. Substituting in what the function $h$ is,

  \[
  \begin{aligned}
  h'(a) &= \lim_{\Delta t \to 0} \frac{(a + \Delta t)^2 - a^2}{\Delta t} \\
        &= \lim_{\Delta t \to 0} \frac{a^2 + 2a\,\Delta t + \Delta t^2 - a^2}{\Delta t}
           && \text{(just squared out } (a + \Delta t)^2\text{)} \\
        &= \lim_{\Delta t \to 0} \frac{2a\,\Delta t + \Delta t^2}{\Delta t} \\
        &= \lim_{\Delta t \to 0} (2a + \Delta t) \\
        &= 2a
  \end{aligned}
  \]

• You should go back and check that this is what we got in Example 2.1.3; just some
  names have been changed.


» An important point (and some notation) 


Notice here that the answer we get depends on our choice of $a$: if we want to know the
derivative at $a = 3$ we can just substitute $a = 3$ into our answer $2a$ to get the slope is 6.
If we want to know at $a = 1$ (like at the end of Section 1.1) we substitute $a = 1$ and get
the slope is 2. The important thing here is that we can move from the derivative being
computed at a specific point to the derivative being a function itself: input any value
of $a$ and it returns the slope of the tangent line to the curve at the point $x = a$, $y = h(a)$.
The variable $a$ is a dummy variable. We can rename $a$ to anything we want, like $x$, for
example. So we can replace every $a$ in

\[ h'(a) = 2a \qquad\text{by } x\text{, giving}\qquad h'(x) = 2x \]

where all we have done is replaced the symbol $a$ by the symbol $x$.
We can do this more generally and tweak the derivative at a specific point $a$ to obtain
the derivative as a function of $x$. We replace

\[ f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \]

with

\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]

which gives us the following definition


Definition 2.2.6.

Let $f(x)$ be a function.

• The derivative of $f(x)$ with respect to $x$ is

  \[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]

  provided the limit exists.

• If the derivative $f'(x)$ exists for all $x \in (a, b)$ we say that $f$ is differentiable
  on $(a, b)$.

• Note that we will sometimes be a little sloppy with our discussions and
  simply write “$f$ is differentiable” to mean “$f$ is differentiable on an interval
  we are interested in” or “$f$ is differentiable everywhere”.


Notice that we are no longer thinking of tangent lines, rather this is an operation we 
can do on a function. For example: 


Example 2.2.7 (The derivative of $f(x) = \frac{1}{x}$)

Let $f(x) = \frac{1}{x}$ and compute its derivative with respect to $x$. Think carefully about where
the derivative exists.

• Our first step is to write down the definition of the derivative. At this stage, we
  know of no other strategy for computing derivatives.

  \[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \qquad\text{(the definition)} \]

• And now we substitute in the function and compute the limit.

  \[
  \begin{aligned}
  f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} && \text{(the definition)} \\
        &= \lim_{h \to 0} \frac{1}{h}\left[\frac{1}{x+h} - \frac{1}{x}\right]
           && \text{(substituted in the function)} \\
        &= \lim_{h \to 0} \frac{1}{h}\,\frac{x - (x+h)}{x(x+h)}
           && \text{(wrote over a common denominator)} \\
        &= \lim_{h \to 0} \frac{1}{h}\,\frac{-h}{x(x+h)} && \text{(started cleanup)} \\
        &= \lim_{h \to 0} \frac{-1}{x(x+h)} \\
        &= -\frac{1}{x^2}
  \end{aligned}
  \]

• Notice that the original function $f(x) = \frac{1}{x}$ was not defined at $x = 0$ and the
  derivative is also not defined at $x = 0$. This does happen more generally: if $f(x)$ is not
  defined at a particular point $x = a$, then the derivative will not exist at that point
  either.


So we now have two slightly different ideas of derivatives: 


e The derivative f’(a) at a specific point x = a, being the slope of the tangent line to 
the curve at x = a, and 


e The derivative as a function, f’(x) as defined in Definition 2.2.6. 


Of course, if we have f’(x) then we can always recover the derivative at a specific point 
by substituting x = a. 


As we noted at the beginning of the chapter, the derivative was discovered independently
by Newton and Leibniz in the late 17th century. Because their discoveries were
independent, Newton and Leibniz did not have exactly the same notation. Stemming
from this, and from the many different contexts in which derivatives are used, there are
quite a few alternate notations for the derivative:
Notation 2.2.8.

The following notations are all used for “the derivative of $f(x)$ with respect to $x$”

\[ f'(x) \qquad \frac{df}{dx} \qquad \frac{d}{dx}f(x) \qquad Df(x) \qquad D_x f(x), \]

while the following notations are all used for “the derivative of $f(x)$ at $x = a$”

\[ f'(a) \qquad \frac{df}{dx}(a) \qquad \frac{d}{dx}f(x)\Big|_{x=a} \qquad Df(a) \qquad D_x f(a). \]

Some things to note about these notations:

• We will generally use the first three, but you should recognise them all. The
  notation $f'(a)$ is due to Lagrange, while the notation $\frac{df}{dx}(a)$ is due to Leibniz.
  They are both very useful. Neither can be considered “better”.

• Leibniz notation writes the derivative as a “fraction”. However it is definitely
  not a fraction and should not be thought of in that way. It is just
  shorthand, which is read as “the derivative of $f$ with respect to $x$”.

• You read $f'(x)$ as “f-prime of x”, and $\frac{df}{dx}$ as “dee-f-dee-x”, and
  $\frac{d}{dx}f(x)$ as “dee-by-dee-x of f of x”.

• Similarly you read $\frac{df}{dx}(a)$ as “dee-f-dee-x at a”, and
  $\frac{d}{dx}f(x)\big|_{x=a}$ as “dee-by-dee-x of f of x at x equals a”.


» Back to computing some derivatives 


At this point we could try to start working out how derivatives interact with arithmetic 
and make an “Arithmetic of derivatives” theorem just like the one we saw for limits (The- 
orem 2). We will get there shortly, but before that it is important that we become more 
comfortable with computing derivatives using limits and then understanding what the 
derivative actually means. So — more examples. 


Example 2.2.9 ($\frac{d}{dx}\sqrt{x}$)

Compute the derivative, $f'(a)$, of the function $f(x) = \sqrt{x}$ at the point $x = a$ for any $a > 0$.

• So again we start with the definition of derivative and go from there:

  \[ f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a} \frac{\sqrt{x} - \sqrt{a}}{x - a} \]

• As $x$ tends to $a$, the numerator and denominator both tend to zero. But $\frac{0}{0}$ is not
  defined. So to get a well defined limit we need to exhibit a cancellation between the
  numerator and denominator, just as we saw in Examples 1.4.11 and 1.4.16. Now
  there are two equivalent ways to proceed from here, both based on a similar “trick”.

• For the first, review Example 1.4.16, which concerned taking a limit involving square-roots,
  and recall that we used “multiplication by the conjugate” there:

  \[
  \begin{aligned}
  \frac{\sqrt{x} - \sqrt{a}}{x - a}
    &= \frac{\sqrt{x} - \sqrt{a}}{x - a} \times \frac{\sqrt{x} + \sqrt{a}}{\sqrt{x} + \sqrt{a}}
       && \left(\text{multiplication by } 1 = \tfrac{\text{conjugate}}{\text{conjugate}}\right) \\
    &= \frac{(\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a})}{(x - a)(\sqrt{x} + \sqrt{a})} \\
    &= \frac{x - a}{(x - a)(\sqrt{x} + \sqrt{a})}
       && \left(\text{since } (A - B)(A + B) = A^2 - B^2\right) \\
    &= \frac{1}{\sqrt{x} + \sqrt{a}}
  \end{aligned}
  \]

• Alternatively, we can arrive at $\frac{\sqrt{x} - \sqrt{a}}{x - a} = \frac{1}{\sqrt{x} + \sqrt{a}}$
  by using almost the same trick to factor the denominator. Just set $A = \sqrt{x}$ and
  $B = \sqrt{a}$ in $A^2 - B^2 = (A - B)(A + B)$ to get

  \[ x - a = (\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a}) \]

  and then substitute this little fact into our expression

  \[
  \begin{aligned}
  \frac{\sqrt{x} - \sqrt{a}}{x - a}
    &= \frac{\sqrt{x} - \sqrt{a}}{(\sqrt{x} - \sqrt{a})(\sqrt{x} + \sqrt{a})}
       && \text{(now cancel common factors)} \\
    &= \frac{1}{\sqrt{x} + \sqrt{a}}
  \end{aligned}
  \]

• Once we know that $\frac{\sqrt{x} - \sqrt{a}}{x - a} = \frac{1}{\sqrt{x} + \sqrt{a}}$ we can take
  the limit we need:

  \[
  \begin{aligned}
  f'(a) &= \lim_{x \to a} \frac{\sqrt{x} - \sqrt{a}}{x - a} \\
        &= \lim_{x \to a} \frac{1}{\sqrt{x} + \sqrt{a}} \\
        &= \frac{1}{2\sqrt{a}}
  \end{aligned}
  \]

• We should think about the domain of $f'$ here. That is, for which values of $a$ is
  $f'(a)$ defined? The original function $f(x)$ was defined for all $x \ge 0$, however the
  derivative $f'(a) = \frac{1}{2\sqrt{a}}$ is undefined at $a = 0$.

  If we draw a careful picture of $\sqrt{x}$ around $x = 0$ we can see why this has to be the
  case. The figure below shows three different tangent lines to the graph of $y = f(x) = \sqrt{x}$.
  As the point of tangency moves closer and closer to the origin, the tangent line
  gets steeper and steeper. The slope of the tangent line at $(a, \sqrt{a})$ blows up as $a \to 0$.
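The cancellation in Example 2.2.9 can be sanity-checked numerically: for $x \ne a$ the raw quotient and its simplified form should agree, and only the simplified form can be evaluated at $x = a$ itself. This is a small Python sketch under our own naming (`raw_quotient` and `simplified_quotient` are not from the text).

```python
import math


def raw_quotient(x, a):
    """(sqrt(x) - sqrt(a)) / (x - a); undefined (0/0) when x == a."""
    return (math.sqrt(x) - math.sqrt(a)) / (x - a)


def simplified_quotient(x, a):
    """1 / (sqrt(x) + sqrt(a)); the form obtained after cancelling."""
    return 1.0 / (math.sqrt(x) + math.sqrt(a))


a = 4.0
for x in [4.1, 4.01, 4.001]:
    print(f"x = {x:<6} raw = {raw_quotient(x, a):.8f} "
          f"simplified = {simplified_quotient(x, a):.8f}")

# Only the simplified form makes sense at x = a; it gives 1 / (2 sqrt(a)).
slope_at_a = simplified_quotient(a, a)
print("value at x = a:", slope_at_a)
```

For $a = 4$ the simplified form at $x = a$ evaluates to $\frac{1}{2\sqrt{4}} = \frac{1}{4}$, matching the derivative found in the example.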
Example 2.2.10 ($\frac{d}{dx}|x|$)

Compute the derivative, $f'(a)$, of the function $f(x) = |x|$ at the point $x = a$.

• We should start this example by recalling the definition of $|x|$ (we saw this back in
  Example 1.5.6):

  \[
  |x| = \begin{cases} -x & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ x & \text{if } x > 0. \end{cases}
  \]

  It is definitely not just “chop off the minus sign”.

• This breaks our computation of the derivative into 3 cases depending on whether $x$
  is positive, negative or zero.

• Assume $x > 0$. Then

  \[
  \begin{aligned}
  \frac{df}{dx} &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\
                &= \lim_{h \to 0} \frac{|x+h| - |x|}{h}
  \end{aligned}
  \]

  Since $x > 0$ and we are interested in the behaviour of this function as $h \to 0$ we can
  assume $|h|$ is much smaller than $x$. This means $x + h > 0$ and so $|x+h| = x + h$.

  \[
  \begin{aligned}
  \frac{df}{dx} &= \lim_{h \to 0} \frac{x + h - x}{h} \\
                &= \lim_{h \to 0} \frac{h}{h} = 1 && \text{as expected}
  \end{aligned}
  \]

• Assume $x < 0$. Then

  \[
  \begin{aligned}
  \frac{df}{dx} &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\
                &= \lim_{h \to 0} \frac{|x+h| - |x|}{h}
  \end{aligned}
  \]

  Since $x < 0$ and we are interested in the behaviour of this function as $h \to 0$ we can
  assume $|h|$ is much smaller than $|x|$. This means $x + h < 0$ and so $|x+h| = -(x+h)$.

  \[
  \begin{aligned}
  \frac{df}{dx} &= \lim_{h \to 0} \frac{-(x+h) - (-x)}{h} \\
                &= \lim_{h \to 0} \frac{-h}{h} = -1
  \end{aligned}
  \]

• When $x = 0$ we have

  \[
  \begin{aligned}
  f'(0) &= \lim_{h \to 0} \frac{f(0+h) - f(0)}{h} \\
        &= \lim_{h \to 0} \frac{|0+h| - |0|}{h} = \lim_{h \to 0} \frac{|h|}{h}
  \end{aligned}
  \]

  To proceed we need to know if $h > 0$ or $h < 0$, so we must use one-sided limits. The
  limit from above is:

  \[ \lim_{h \to 0^+} \frac{|h|}{h} = \lim_{h \to 0^+} \frac{h}{h} = 1 \qquad\text{since } h > 0,\ |h| = h \]

  whereas the limit from below is:

  \[ \lim_{h \to 0^-} \frac{|h|}{h} = \lim_{h \to 0^-} \frac{-h}{h} = -1 \qquad\text{since } h < 0,\ |h| = -h \]

  Since the one-sided limits differ, the limit as $h \to 0$ does not exist. And thus the
  derivative does not exist at $x = 0$.

In summary:

\[
\frac{d}{dx}|x| = \begin{cases} -1 & \text{if } x < 0 \\ \text{DNE} & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}
\]
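The disagreement between the one-sided limits at $x = 0$ is easy to see numerically. The following Python sketch evaluates the difference quotient of $|x|$ at $0$ for small positive and small negative $h$; the helper name `one_sided_quotients` is our own and is only meant as an illustration.

```python
def one_sided_quotients(f, a, hs):
    """Difference quotients of f at a, using +h (right) and -h (left)."""
    right = [(f(a + h) - f(a)) / h for h in hs]
    left = [(f(a - h) - f(a)) / (-h) for h in hs]
    return right, left


hs = [10.0 ** -k for k in range(1, 6)]  # 0.1, 0.01, ..., 0.00001
right, left = one_sided_quotients(abs, 0.0, hs)

for h, r, l in zip(hs, right, left):
    # From the right the quotient is |h|/h = 1; from the left it is -1.
    print(f"h = {h:<8} from the right: {r:+.1f}   from the left: {l:+.1f}")
```

However small $h$ is taken, the two columns stay at $+1$ and $-1$, so no single limiting value exists.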


» Where is the derivative undefined? 
According to Definition 2.2.1, the derivative $f'(a)$ exists precisely when the limit
$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ exists. That limit is also the slope of the tangent line
to the curve $y = f(x)$ at $x = a$. That limit does not exist when the curve $y = f(x)$ does not
have a tangent line at $x = a$ or when the curve does have a tangent line, but the tangent line
has infinite slope. We have already seen some examples of this.


• In Example 2.2.7, we considered the function $f(x) = \frac{1}{x}$. This function “blows up”
  (i.e. becomes infinite) at $x = 0$. It does not have a tangent line at $x = 0$ and its
  derivative does not exist at $x = 0$.


e In Example 2.2.10, we considered the function f(x) = |x|. This function does not 
have a tangent line at x = 0, because there is a sharp corner in the graph of y = |x| 
at x = 0. (Look at the graph in Example 2.2.10.) So the derivative of f(x) = |x| does 
not exist at x = 0. 


Here are a few more examples. 


Example 2.2.11

Visually, the function

\[
H(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x > 0 \end{cases}
\]

does not have a tangent line at $(0, 0)$. Not surprisingly, when $a = 0$ and $h$ tends to 0 with
$h > 0$,

\[ \frac{H(a+h) - H(a)}{h} = \frac{H(h) - H(0)}{h} = \frac{1}{h} \]

blows up. The same sort of computation shows that $f'(a)$ cannot possibly exist whenever
the function $f$ is not continuous at $a$.


Example 2.2.12 ($\frac{d}{dx}x^{1/3}$)

Visually, it looks like the function $f(x) = x^{1/3}$, sketched below (this might be a good
point to recall that cube roots of negative numbers are negative; for example, since
$(-1)^3 = -1$, the cube root of $-1$ is $-1$), has the $y$-axis as its tangent line at $(0, 0)$.
So we would expect that $f'(0)$ does not exist. Let’s check. With $a = 0$,

\[
f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}
      = \lim_{h \to 0} \frac{f(h) - f(0)}{h}
      = \lim_{h \to 0} \frac{h^{1/3}}{h}
      = \lim_{h \to 0} \frac{1}{h^{2/3}}
      = \text{DNE}
\]

as expected.
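To see the quotient $\frac{1}{h^{2/3}}$ blow up, one can evaluate the difference quotient of $x^{1/3}$ at $0$ for shrinking $h$ of either sign. In this Python sketch the helper `cube_root` is our own: it uses `math.copysign` so that negative inputs give negative real cube roots, as recalled in the example.

```python
import math


def cube_root(x):
    """Real cube root; copysign keeps the sign for negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def quotient_at_zero(h):
    """Difference quotient (f(0+h) - f(0)) / h for f(x) = x^(1/3)."""
    return (cube_root(0.0 + h) - cube_root(0.0)) / h


for h in [1e-3, 1e-6, 1e-9, -1e-3, -1e-6, -1e-9]:
    # The quotient equals 1 / |h|^(2/3), which grows without bound.
    print(f"h = {h:<8} quotient = {quotient_at_zero(h):.1f}")
```

The quotient is positive for both signs of $h$ and grows roughly like $|h|^{-2/3}$, consistent with a vertical tangent line at the origin.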
Example 2.2.13 ($\frac{d}{dx}\sqrt{|x|}$)

We have already considered the derivative of the function $\sqrt{x}$ in Example 2.2.9. We’ll now
look at the function $f(x) = \sqrt{|x|}$. Recall, from Example 2.2.10, the definition of $|x|$. When
$x > 0$, we have $|x| = x$ and $f(x)$ is identical to $\sqrt{x}$. When $x < 0$, we have $|x| = -x$ and
$f(x) = \sqrt{-x}$. So to graph $y = \sqrt{|x|}$ when $x < 0$, you just have to graph $y = \sqrt{x}$ for $x > 0$
and then send $x \to -x$, i.e. reflect the graph in the $y$-axis. Here is the graph.

(Graph of $y = \sqrt{|x|}$.)

The pointy thing at the origin is called a cusp. The graph of $y = f(x)$ does not have a tangent line at
$(0, 0)$ and, correspondingly, $f'(0)$ does not exist because

\[
\lim_{h \to 0^+} \frac{f(0+h) - f(0)}{h}
  = \lim_{h \to 0^+} \frac{\sqrt{|h|}}{h}
  = \lim_{h \to 0^+} \frac{1}{\sqrt{h}}
  = \text{DNE}
\]
2.3 Interpretations of the derivative


In the previous sections we defined the derivative as the slope of a tangent line, using a
particular limit. This allows us to compute “the slope of a curve¹⁰” and provides us with
one interpretation of the derivative. However, the main importance of derivatives does
not come from this application. Instead, (arguably) it comes from the interpretation of the
derivative as the instantaneous rate of change of a quantity.


» Instantaneous rate of change 


In fact we have already (secretly) used a derivative to compute an instantaneous rate 
of change in Section 1.2. For your convenience we'll review that computation here, in 
Example 2.3.1, and then generalise it. 


Example 2.3.1 


You drop a ball from a tall building. After $t$ seconds the ball has fallen a distance of
$s(t) = 4.9t^2$ metres. What is the velocity of the ball one second after it is dropped?

• In the time interval from $t = 1$ to $t = 1 + h$ the ball travels a distance

  \[ s(1+h) - s(1) = 4.9(1+h)^2 - 4.9(1)^2 = 4.9\left[2h + h^2\right] \]

• So the average velocity over this time interval is

  \[
  \begin{aligned}
  \text{average velocity from } t = 1 \text{ to } t = 1+h
    &= \frac{\text{distance travelled from } t = 1 \text{ to } t = 1+h}
            {\text{length of time from } t = 1 \text{ to } t = 1+h} \\
    &= \frac{s(1+h) - s(1)}{h} \\
    &= \frac{4.9\left[2h + h^2\right]}{h} \\
    &= 4.9[2 + h]
  \end{aligned}
  \]

• The instantaneous velocity at time $t = 1$ is then defined to be the limit

  \[
  \begin{aligned}
  \text{instantaneous velocity at time } t = 1
    &= \lim_{h \to 0} \left[\text{average velocity from } t = 1 \text{ to } t = 1+h\right] \\
    &= \lim_{h \to 0} \frac{s(1+h) - s(1)}{h} = s'(1) \\
    &= \lim_{h \to 0} 4.9[2 + h] \\
    &= 9.8 \text{ m/sec}
  \end{aligned}
  \]

• We conclude that the instantaneous velocity at time $t = 1$, which is the instantaneous
  rate of change of distance per unit time at time $t = 1$, is the derivative $s'(1) = 9.8$ m/sec.

10 Again, recall that we are being a little sloppy with this term: we really mean “the slope of the
tangent line to the curve”.
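The average velocities in Example 2.3.1 can be tabulated directly. Below is a minimal Python sketch, assuming the same distance function $s(t) = 4.9t^2$; the function names `distance_fallen` and `average_velocity` are ours.

```python
def distance_fallen(t):
    """Distance in metres fallen after t seconds: s(t) = 4.9 t^2."""
    return 4.9 * t ** 2


def average_velocity(t, h):
    """Average velocity over the time interval from t to t + h."""
    return (distance_fallen(t + h) - distance_fallen(t)) / h


for h in [1.0, 0.1, 0.01, 0.001]:
    # Algebraically this equals 4.9 * (2 + h) when t = 1.
    print(f"h = {h:<6} average velocity = {average_velocity(1.0, h):.4f} m/sec")
```

As the time interval shrinks, the average velocities approach the instantaneous velocity $s'(1) = 9.8$ m/sec found in the example.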


Now suppose, more generally, that you are taking a walk and that as you walk, you
are continuously measuring some quantity, like temperature, and that the measurement
at time $t$ is $f(t)$. Then the

\[
\begin{aligned}
\text{average rate of change of } f(t) \text{ from } t = a \text{ to } t = a+h
  &= \frac{\text{change in } f(t) \text{ from } t = a \text{ to } t = a+h}
          {\text{length of time from } t = a \text{ to } t = a+h} \\
  &= \frac{f(a+h) - f(a)}{h}
\end{aligned}
\]

so the

\[
\begin{aligned}
\text{instantaneous rate of change of } f(t) \text{ at } t = a
  &= \lim_{h \to 0} \left[\text{average rate of change of } f(t) \text{ from } t = a \text{ to } t = a+h\right] \\
  &= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \\
  &= f'(a)
\end{aligned}
\]

In particular, if you are walking along the $x$-axis and your $x$-coordinate at time $t$ is $x(t)$,
then $x'(a)$ is the instantaneous rate of change (per unit time) of your $x$-coordinate at time
$t = a$, which is your velocity at time $a$. If $v(t)$ is your velocity at time $t$, then $v'(a)$ is the
instantaneous rate of change of your velocity at time $a$. This is called your acceleration at
time $a$.


» Slope 


Suppose that $y = f(x)$ is the equation of a curve in the $xy$-plane. That is, $f(x)$ is the
$y$-coordinate of the point on the curve whose $x$-coordinate is $x$. Then, as we have already
seen,

\[
\left[\text{the slope of the secant through } (a, f(a)) \text{ and } (a+h, f(a+h))\right] = \frac{f(a+h) - f(a)}{h}
\]

This is shown in Figure 2.3.1 below.

Figure 2.3.1.

In order to create the tangent line (as we have done a few times now) we squeeze
$h \to 0$. As we do this, the secant through $(a, f(a))$ and $(a+h, f(a+h))$ approaches¹¹ the
tangent line to $y = f(x)$ at $x = a$. Since the secant becomes the tangent line in this limit,
the slope of the secant becomes the slope of the tangent and

\[
\left[\text{the slope of the tangent line to } y = f(x) \text{ at } x = a\right]
  = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = f'(a).
\]

Let us go a little further and work out a general formula for the equation of the tangent
line to $y = f(x)$ at $x = a$. We know that the tangent line

• has slope $f'(a)$ and
• passes through the point $(a, f(a))$.

There are a couple of different ways to construct the equation of the tangent line from this
information. One is to observe, as in Figure 2.3.2, that if $(x, y)$ is any other point on the
tangent line then the line segment from $(a, f(a))$ to $(x, y)$ is part of the tangent line and so
also has slope $f'(a)$. That is,

\[ \frac{y - f(a)}{x - a} = \left[\text{the slope of the tangent line}\right] = f'(a) \]

Cross multiplying gives us the equation of the tangent line:

\[ y - f(a) = f'(a)\,(x - a) \qquad\text{or}\qquad y = f(a) + f'(a)\,(x - a) \]

11 We are of course assuming that the curve is smooth enough to have a tangent line at $a$.
Figure 2.3.2. A line segment of a tangent line.

A second way to derive the same equation of the same tangent line is to recall that
the general equation for a line, with finite slope, is $y = mx + b$, where $m$ is the slope and
$b$ is the $y$-intercept. We already know the slope, so $m = f'(a)$. To work out $b$ we use
the other piece of information: $(a, f(a))$ is on the line. So $(x, y) = (a, f(a))$ must solve
$y = f'(a)\,x + b$. That is,

\[ f(a) = f'(a) \cdot a + b \qquad\text{and so}\qquad b = f(a) - a f'(a) \]

Hence our equation is, once again,

\[ y = f'(a) \cdot x + \left(f(a) - a f'(a)\right) \qquad\text{or, after rearranging a little,}\qquad y = f(a) + f'(a)\,(x - a) \]

This is a very useful formula, so perhaps we should make it a theorem.


Theorem 2.3.2 (Tangent line).

The tangent line to the curve $y = f(x)$ at $x = a$ is given by the equation

\[ y = f(a) + f'(a)\,(x - a) \]

provided the derivative $f'(a)$ exists.


The caveat at the end of the above theorem is necessary — there are certainly cases in 
which the derivative does not exist and so we do need to be careful. 


Example 2.3.3 


Find the tangent line to the curve $y = \sqrt{x}$ at $x = 4$.

Rather than redoing everything from scratch, we can, and for efficiency, should, use
Theorem 2.3.2. To write this up properly, we must ensure that we tell the reader what we
are doing. So something like the following:

• By Theorem 2.3.2, the tangent line to the curve $y = f(x)$ at $x = a$ is given by
  \[ y = f(a) + f'(a)(x - a) \]
  provided $f'(a)$ exists.

• In Example 2.2.9, we found that, for any $a > 0$, the derivative of $\sqrt{x}$ at $x = a$ is
  \[ f'(a) = \frac{1}{2\sqrt{a}} \]

• In the current example, $a = 4$ and we have
  \[
  f(a) = \sqrt{x}\,\Big|_{x=4} = \sqrt{4} = 2
  \qquad\text{and}\qquad
  f'(a) = \frac{1}{2\sqrt{a}}\,\Big|_{a=4} = \frac{1}{2\sqrt{4}} = \frac{1}{4}
  \]

• So the equation of the tangent line to $y = \sqrt{x}$ at $x = 4$ is
  \[ y = 2 + \frac{1}{4}(x - 4) \qquad\text{or}\qquad y = \frac{x}{4} + 1 \]

We don’t have to write it up using dot-points as above; we have used them here to help
delineate each step in the process of computing the tangent line.
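The tangent line formula of Theorem 2.3.2 translates directly into a few lines of code. The sketch below is an illustration under our own naming: `tangent_line` takes the already known values $f(a)$ and $f'(a)$ (here supplied from Example 2.2.9) rather than computing any derivative itself.

```python
import math


def tangent_line(f_a, fprime_a, a):
    """Return the function x -> f(a) + f'(a) * (x - a)."""
    def line(x):
        return f_a + fprime_a * (x - a)
    return line


a = 4.0
f_a = math.sqrt(a)                   # f(4) = 2
fprime_a = 1.0 / (2 * math.sqrt(a))  # f'(4) = 1/4, from Example 2.2.9
line = tangent_line(f_a, fprime_a, a)

print("value at x = 4:", line(4.0))   # the line touches the curve here
print("y-intercept:", line(0.0))      # matches y = x/4 + 1
for x in [3.9, 4.1, 5.0]:
    # Near x = 4 the line stays close to the curve y = sqrt(x).
    print(f"x = {x}: line = {line(x):.5f}, sqrt(x) = {math.sqrt(x):.5f}")
```

Close to $x = 4$ the line and the curve are nearly indistinguishable, while further away (for instance at $x = 5$) the gap becomes visible.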


2.4 Arithmetic of derivatives - a differentiation toolbox


So far, we have evaluated derivatives only by applying Definition 2.2.1 to the function 
at hand and then computing the required limits directly. It is quite obvious that as the 
function being differentiated becomes even a little complicated, this procedure quickly 
becomes extremely unwieldy. It is many orders of magnitude more efficient to have access 
to 


• a list of derivatives of some simple functions and


• a collection of rules for breaking down complicated derivative computations into
sequences of simple derivative computations.


This is precisely what we did to compute limits. We started with limits of simple functions
and then used “arithmetic of limits” to compute limits of complicated functions.

We have already started building our list of derivatives of simple functions. We have 
shown, in Examples 2.2.2, 2.2.3, 2.2.5 and 2.2.9, that 


\[ \frac{d}{dx} 1 = 0 \qquad \frac{d}{dx} x = 1 \qquad \frac{d}{dx} x^2 = 2x \qquad \frac{d}{dx}\sqrt{x} = \frac{1}{2\sqrt{x}} \]


We'll expand this list later.

We now start building a collection of tools that help reduce the problem of computing
the derivative of a complicated function to that of computing the derivatives of a number 
of simple functions. In this section we give three derivative “rules” as three separate 
theorems. We’ll give the proofs of these theorems in the next section and examples of how 
they are used in the following section. 

As was the case for limits, derivatives interact very cleanly with addition, subtraction 
and multiplication by a constant. The following result actually follows very directly from 
the first three points of Theorem 1.4.2. 


Lemma 2.4.1 (Derivative of sum and difference). 


Let f(x), g(x) be differentiable functions and let $c \in \mathbb{R}$ be a constant. Then

\[ \frac{d}{dx}\{f(x) + g(x)\} = f'(x) + g'(x) \]
\[ \frac{d}{dx}\{f(x) - g(x)\} = f'(x) - g'(x) \]
\[ \frac{d}{dx}\{c f(x)\} = c f'(x) \]


That is, the derivative of the sum is the sum of the derivatives, and so forth. 


Following this we can combine the three statements in this lemma into a single rule 
which captures the “linearity of differentiation”. 


Theorem 2.4.2 (Linearity of differentiation). 


Again, let f(x), g(x) be differentiable functions, let $\alpha, \beta \in \mathbb{R}$ be constants and
define the “linear combination”

\[ S(x) = \alpha f(x) + \beta g(x). \]

Then the derivative of S(x) exists and is

\[ \frac{dS}{dx} = S'(x) = \alpha f'(x) + \beta g'(x) \]

Note that we can recover the three rules in the previous lemma by setting $\alpha = \beta = 1$,
or $\alpha = 1, \beta = -1$, or $\alpha = c, \beta = 0$.


Unfortunately, the derivative does not act quite as simply on products or quotients.
The rules for computing derivatives of products and quotients get their own names and
theorems:

Theorem 2.4.3 (The product rule).


Let f(x), g(x) be differentiable functions. Then the derivative of the product
f(x)g(x) exists and is given by

\[ \frac{d}{dx}\{f(x)\,g(x)\} = f'(x)\,g(x) + f(x)\,g'(x). \]


Before we proceed to the derivative of the ratio of two functions, it is worth noting a
special case of the product rule when g(x) = f(x). In fact, since this is a useful special
case, let us call it a corollary¹²:


Corollary 2.4.4 (Derivative of a square).


Let f(x) be a differentiable function. Then the derivative of its square is

\[ \frac{d}{dx}\{f(x)^2\} = 2 f(x)\, f'(x). \]


With a little work this can be generalised to other powers — but that is best done once 
we understand how to compute the derivative of the composition of two functions. That 
requires the chain rule (see Theorem 2.9.2 below). But before we get to that, we need to 
see how to take the derivative of a quotient of two functions. 


Theorem 2.4.5 (The quotient rule). 


Let f(x), g(x) be differentiable functions. Then the derivative of their quotient is

\[ \frac{d}{dx}\left\{\frac{f(x)}{g(x)}\right\} = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{g(x)^2}. \]

This derivative exists except at points where g(x) = 0.


There is a useful special case of this theorem which we obtain by setting f(x) = 1. In 
that case, the quotient rule tells us how to compute the derivative of the reciprocal of a 
function. 


12 Recall that a corollary is an important result that follows from one or more theorems, typically
without too much extra work, as is the case here.




Corollary 2.4.6 (Derivative of a reciprocal). 


Let g(x) be a differentiable function. Then the derivative of the reciprocal of g is
given by

\[ \frac{d}{dx}\left\{\frac{1}{g(x)}\right\} = -\frac{g'(x)}{g(x)^2} \]

and exists except at those points where g(x) = 0.


So we have covered sums, differences, products and quotients. This allows us to
compute derivatives of many different functions — including polynomials and rational 
functions. However we are still missing trigonometric functions (for example), and a rule 
for computing derivatives of compositions. These will follow in the near future, but there 
are a couple of things to do before that — understand where the above theorems come 
from, and practice using them. 
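Before we look at where these rules come from, it can be reassuring to see them agree with a direct numerical estimate of the derivative. The following is only a sketch under our own assumptions: we pick f(x) = x^2 and g(x) = 4x + 7 purely for illustration, and the helper `central_difference` is a hypothetical name for a standard symmetric difference quotient, not something defined in the text.

```python
def central_difference(func, x, h=1e-6):
    """Estimate func'(x) by the symmetric quotient (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Illustrative choice (an assumption, not from the text): f(x) = x^2, g(x) = 4x + 7
f, df = (lambda x: x * x), (lambda x: 2.0 * x)
g, dg = (lambda x: 4.0 * x + 7.0), (lambda x: 4.0)

def product_rule(x):
    """f'(x) g(x) + f(x) g'(x), as in Theorem 2.4.3."""
    return df(x) * g(x) + f(x) * dg(x)

def quotient_rule(x):
    """(f'(x) g(x) - f(x) g'(x)) / g(x)^2, as in Theorem 2.4.5."""
    return (df(x) * g(x) - f(x) * dg(x)) / g(x) ** 2

x0 = 1.5
print(product_rule(x0), central_difference(lambda x: f(x) * g(x), x0))
print(quotient_rule(x0), central_difference(lambda x: f(x) / g(x), x0))
```

Each printed pair should agree to several decimal places. This is evidence, not a proof; the proofs are the business of the next section.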


2.5 Proofs of the arithmetic of derivatives


The theorems of the previous section are not too difficult to prove from the definition of 
the derivative (which we know) and the arithmetic of limits (which we also know). In this 
section we show how to construct these rules. 

Throughout this section we will use our two functions f(x) and g(x). Since the theorems
we are going to prove all express derivatives of linear combinations, products and
quotients in terms of f, g and their derivatives, it is helpful to recall the definitions of the
derivatives of f and g:

\[ f'(x) = \lim_{h\to 0}\frac{f(x+h) - f(x)}{h} \qquad\text{and}\qquad g'(x) = \lim_{h\to 0}\frac{g(x+h) - g(x)}{h}. \]

Our proofs, roughly speaking, involve doing algebraic manipulations to uncover the
expressions that look like the above.


» Proof of the linearity of differentiation (Theorem 2.4.2) 


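Recall that in Theorem 2.4.2 we defined $S(x) = \alpha f(x) + \beta g(x)$, where $\alpha, \beta \in \mathbb{R}$ are constants.
We wish to compute S'(x), so we start with the definition:

\[ S'(x) = \lim_{h\to 0}\frac{S(x+h) - S(x)}{h} \]

Let us concentrate on the numerator of the expression inside the limit and then come back
to the full limit in a moment. Substitute in the definition of S(x) and collect terms:

\begin{align*}
S(x+h) - S(x) &= \big[\alpha f(x+h) + \beta g(x+h)\big] - \big[\alpha f(x) + \beta g(x)\big] \\
&= \alpha\big[f(x+h) - f(x)\big] + \beta\big[g(x+h) - g(x)\big]
\end{align*}

Now it is easy to see the structures we need: we almost have the expressions
for the derivatives f'(x) and g'(x). Indeed, all we need to do is divide by h and take the
limit. So let's finish things off.

\begin{align*}
S'(x) &= \lim_{h\to 0}\frac{S(x+h) - S(x)}{h} \\
&= \lim_{h\to 0}\frac{\alpha\big[f(x+h) - f(x)\big] + \beta\big[g(x+h) - g(x)\big]}{h} && \text{from above} \\
&= \alpha\lim_{h\to 0}\frac{f(x+h) - f(x)}{h} + \beta\lim_{h\to 0}\frac{g(x+h) - g(x)}{h} && \text{limit laws} \\
&= \alpha f'(x) + \beta g'(x)
\end{align*}

as required.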


» Proof of the product rule (Theorem 2.4.3) 


After the warm-up above, we will just jump straight in. Let P(x) = f(x) g(x), the product
of our two functions. The derivative of the product is given by

\[ P'(x) = \lim_{h\to 0}\frac{P(x+h) - P(x)}{h} \]

Again we will focus on the numerator inside the limit and massage it into the form we
need. To simplify these manipulations, define

\[ F(h) = \frac{f(x+h) - f(x)}{h} \qquad\text{and}\qquad G(h) = \frac{g(x+h) - g(x)}{h}. \]

Then we can write

\[ f(x+h) = f(x) + hF(h) \qquad\text{and}\qquad g(x+h) = g(x) + hG(h). \]

We can also write

\[ f'(x) = \lim_{h\to 0} F(h) \qquad\text{and}\qquad g'(x) = \lim_{h\to 0} G(h). \]

So back to that numerator:

\begin{align*}
P(x+h) - P(x) &= f(x+h)\cdot g(x+h) - f(x)\cdot g(x) && \text{substitute} \\
&= [f(x) + hF(h)]\,[g(x) + hG(h)] - f(x)\cdot g(x) && \text{expand} \\
&= f(x)g(x) + f(x)\cdot hG(h) + hF(h)\cdot g(x) + h^2 F(h)\cdot G(h) - f(x)\cdot g(x) \\
&= f(x)\cdot hG(h) + hF(h)\cdot g(x) + h^2 F(h)\cdot G(h).
\end{align*}




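Armed with this we return to the definition of the derivative:

\begin{align*}
P'(x) &= \lim_{h\to 0}\frac{P(x+h) - P(x)}{h} \\
&= \lim_{h\to 0}\frac{f(x)\cdot hG(h) + hF(h)\cdot g(x) + h^2 F(h)\cdot G(h)}{h} \\
&= \left(\lim_{h\to 0}\frac{f(x)\cdot hG(h)}{h}\right) + \left(\lim_{h\to 0}\frac{hF(h)\cdot g(x)}{h}\right) + \left(\lim_{h\to 0}\frac{h^2 F(h)\cdot G(h)}{h}\right) \\
&= \left(\lim_{h\to 0} f(x)\cdot G(h)\right) + \left(\lim_{h\to 0} F(h)\cdot g(x)\right) + \left(\lim_{h\to 0} hF(h)\cdot G(h)\right)
\end{align*}

Now since f(x) and g(x) do not change as we send h to zero, we can pull them outside.
We can also write the third term as the product of 3 limits:

\begin{align*}
P'(x) &= \left(f(x)\lim_{h\to 0} G(h)\right) + \left(g(x)\lim_{h\to 0} F(h)\right) + \left(\lim_{h\to 0} h\right)\cdot\left(\lim_{h\to 0} F(h)\right)\cdot\left(\lim_{h\to 0} G(h)\right) \\
&= f(x)\cdot g'(x) + g(x)\cdot f'(x) + 0\cdot f'(x)\cdot g'(x) \\
&= f(x)\cdot g'(x) + g(x)\cdot f'(x).
\end{align*}

And so we recover the product rule.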


» (optional) — Proof of the quotient rule (Theorem 2.4.5) 


This one is relatively messy, so we have made it optional. Let us start by writing the
quotient of our two functions as Q(x) = f(x)/g(x), and we assume that $g(x) \neq 0$. As
before, we start our manipulations from the definition of the derivative

\begin{align*}
Q'(x) &= \lim_{h\to 0}\frac{Q(x+h) - Q(x)}{h} && \text{(now substitute in $Q$)} \\
&= \lim_{h\to 0}\frac{1}{h}\left[\frac{f(x+h)}{g(x+h)} - \frac{f(x)}{g(x)}\right] && \text{(and then form a common denominator)} \\
&= \lim_{h\to 0}\frac{1}{h}\cdot\frac{f(x+h)\cdot g(x) - f(x)\cdot g(x+h)}{g(x+h)\cdot g(x)}
\end{align*}

We make use of the same F, G that we used in the proof of the product rule. We use
them to rewrite f(x+h) as f(x) + hF(h) and g(x+h) as g(x) + hG(h). First
concentrate on the numerator of the above expression:

\begin{align*}
f(x+h)\cdot g(x) - f(x)\cdot g(x+h) &= [f(x) + hF(h)]\,g(x) - f(x)\,[g(x) + hG(h)] \\
&= f(x)\cdot g(x) + h\,g(x)\cdot F(h) - f(x)\cdot g(x) - h\,f(x)\cdot G(h) \\
&= h\,g(x)\cdot F(h) - h\,f(x)\cdot G(h) \\
&= h\,[F(h)\cdot g(x) - f(x)\cdot G(h)]
\end{align*}


So dividing this by the denominator, and cancelling the common factor of h, gives

\[ \frac{Q(x+h) - Q(x)}{h} = \frac{F(h)\cdot g(x) - f(x)\cdot G(h)}{g(x+h)\cdot g(x)} \]

To arrive at Q'(x) we take the limit of the above as $h \to 0$; since it is the limit of a ratio
with a non-zero denominator, the result is a ratio of limits:

\begin{align*}
Q'(x) &= \lim_{h\to 0}\frac{F(h)\cdot g(x) - f(x)\cdot G(h)}{g(x+h)\cdot g(x)} \\
&= \frac{\lim_{h\to 0}\big[F(h)\cdot g(x) - f(x)\cdot G(h)\big]}{\lim_{h\to 0}\big[g(x+h)\cdot g(x)\big]}
\end{align*}

Processing the denominator gives

\begin{align*}
\lim_{h\to 0}\big[g(x+h)\cdot g(x)\big] &= g(x)\cdot\lim_{h\to 0} g(x+h) && \text{since $g(x)$ doesn't change with $h$} \\
&= g(x)\cdot g(x) = g(x)^2 && \text{since $g$ is differentiable, and so continuous, at $x$}
\end{align*}

While the numerator is

\[ \lim_{h\to 0}\big[F(h)\cdot g(x) - f(x)\cdot G(h)\big] = \lim_{h\to 0} F(h)\cdot g(x) - \lim_{h\to 0} f(x)\cdot G(h) \]

Now because f(x), g(x) do not change with h we can factor them out.

\begin{align*}
&= g(x)\cdot\lim_{h\to 0} F(h) - f(x)\cdot\lim_{h\to 0} G(h) \\
&= g(x)\,f'(x) - f(x)\,g'(x)
\end{align*}

Putting these pieces back together finally gives

\[ Q'(x) = \frac{g(x)\,f'(x) - f(x)\,g'(x)}{g(x)^2} \]

And we arrive at the quotient rule. Note that we have assumed that the derivatives
f'(x), g'(x) exist and that $g(x) \neq 0$.


2.6 Using the arithmetic of derivatives - examples


In this section we illustrate the computation of derivatives using the arithmetic of derivatives
(Theorems 2.4.2, 2.4.3 and 2.4.5). To make it clear which rules we are using during
the examples we will note which theorem we are using:

• LIN to stand for “linearity”: $\frac{d}{dx}\{\alpha f(x) + \beta g(x)\} = \alpha f'(x) + \beta g'(x)$ (Theorem 2.4.2)

• PR to stand for “product rule”: $\frac{d}{dx}\{f(x)\,g(x)\} = f'(x)\,g(x) + f(x)\,g'(x)$ (Theorem 2.4.3)

• QR to stand for “quotient rule”: $\frac{d}{dx}\left\{\frac{f(x)}{g(x)}\right\} = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{g(x)^2}$ (Theorem 2.4.5)

We'll start with a really easy example.

Example 2.6.1

\begin{align*}
\frac{d}{dx}\{4x + 7\} &= 4\cdot\frac{d}{dx}\{x\} + 7\cdot\frac{d}{dx}\{1\} && \text{LIN} \\
&= 4\cdot 1 + 7\cdot 0 = 4
\end{align*}

where we have used LIN with f(x) = x, g(x) = 1, $\alpha = 4$, $\beta = 7$.


Example 2.6.2 


Continuing on from the previous example, we can use the product rule and the previous 
result to compute 


\begin{align*}
\frac{d}{dx}\{x(4x + 7)\} &= x\cdot\frac{d}{dx}\{4x + 7\} + (4x + 7)\,\frac{d}{dx}\{x\} && \text{PR} \\
&= x\cdot 4 + (4x + 7)\cdot 1 \\
&= 8x + 7
\end{align*}

where we have used the product rule PR with f(x) = x and g(x) = 4x + 7.


Example 2.6.3 


In the same vein as the previous example, we can use the quotient rule to compute 


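\begin{align*}
\frac{d}{dx}\left\{\frac{x}{4x + 7}\right\} &= \frac{(4x + 7)\cdot\frac{d}{dx}\{x\} - x\cdot\frac{d}{dx}\{4x + 7\}}{(4x + 7)^2} && \text{QR} \\
&= \frac{(4x + 7)\cdot 1 - x\cdot 4}{(4x + 7)^2} \\
&= \frac{7}{(4x + 7)^2}
\end{align*}

where we have used the quotient rule QR with f(x) = x and g(x) = 4x + 7.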


Now for a messier example. 


Example 2.6.4

Differentiate

\[ f(x) = \frac{x}{2x + \frac{1}{3x+1}} \]

This problem looks nasty. But it isn't so hard if we just build it up a bit at a time.

• First, f(x) is the ratio of
\[ f_1(x) = x \qquad\text{and}\qquad f_2(x) = 2x + \frac{1}{3x+1} \]
If we can find the derivatives of $f_1(x)$ and $f_2(x)$, we will be able to get the derivative
of f(x) just by applying the quotient rule. The derivative, $f_1'(x) = 1$, of $f_1(x)$ is easy,
so let's work on $f_2(x)$.

• The function $f_2(x)$ is the linear combination
\[ f_2(x) = 2 f_3(x) + f_4(x) \qquad\text{with}\qquad f_3(x) = x \quad\text{and}\quad f_4(x) = \frac{1}{3x+1} \]
If we can find the derivatives of $f_3(x)$ and $f_4(x)$, we will be able to get the derivative
of $f_2(x)$ just by applying linearity (Theorem 2.4.2). The derivative, $f_3'(x) = 1$, of
$f_3(x)$ is easy. So let's work on $f_4(x)$.

• The function $f_4(x)$ is the ratio
\[ f_4(x) = \frac{1}{f_5(x)} \qquad\text{with}\qquad f_5(x) = 3x + 1 \]
If we can find the derivative of $f_5(x)$, we will be able to get the derivative of $f_4(x)$
just by applying the special case of the quotient rule (Corollary 2.4.6). The derivative
of $f_5(x)$ is easy.

• So we have completed breaking down f(x) into easy pieces. It is now just a matter
of reversing the breakdown steps, putting everything back together, starting with
the easy pieces and working up to f(x). Here goes.

\begin{align*}
f_5(x) &= 3x + 1 & \text{so}\quad \frac{d}{dx} f_5(x) &= 3\frac{d}{dx}x + \frac{d}{dx}1 = 3\cdot 1 + 0 = 3 && \text{LIN} \\
f_4(x) &= \frac{1}{f_5(x)} & \text{so}\quad \frac{d}{dx} f_4(x) &= -\frac{f_5'(x)}{f_5(x)^2} = -\frac{3}{(3x+1)^2} && \text{QR} \\
f_2(x) &= 2 f_3(x) + f_4(x) & \text{so}\quad \frac{d}{dx} f_2(x) &= 2 f_3'(x) + f_4'(x) = 2 - \frac{3}{(3x+1)^2} && \text{LIN} \\
f(x) &= \frac{f_1(x)}{f_2(x)} & \text{so}\quad \frac{d}{dx} f(x) &= \frac{f_1'(x)\,f_2(x) - f_1(x)\,f_2'(x)}{f_2(x)^2} && \text{QR} \\
& & &= \frac{1\left[2x + \frac{1}{3x+1}\right] - x\left[2 - \frac{3}{(3x+1)^2}\right]}{\left[2x + \frac{1}{3x+1}\right]^2}
\end{align*}

Oof!


• We now have an answer. But we really should clean it up, not only to make it easier
to read, but also because invariably such computations are just small steps inside
much larger computations. Any future computations involving this expression will
be a lot easier and less error prone if we clean it up now. Cancelling the 2x and the
-2x in

\begin{align*}
1\left[2x + \frac{1}{3x+1}\right] - x\left[2 - \frac{3}{(3x+1)^2}\right] &= 2x + \frac{1}{3x+1} - 2x + \frac{3x}{(3x+1)^2} \\
&= \frac{1}{3x+1} + \frac{3x}{(3x+1)^2}
\end{align*}

and multiplying both the numerator and denominator by $(3x+1)^2$ gives

\begin{align*}
f'(x) &= \frac{\frac{1}{3x+1} + \frac{3x}{(3x+1)^2}}{\left[2x + \frac{1}{3x+1}\right]^2}\cdot\frac{(3x+1)^2}{(3x+1)^2} \\
&= \frac{(3x+1) + 3x}{\left[2x(3x+1) + 1\right]^2} \\
&= \frac{6x+1}{\left[6x^2 + 2x + 1\right]^2}
\end{align*}
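With this much algebra, it is worth checking the cleaned-up answer independently. Here is a minimal Python sketch (our own illustration, not part of the original example) that compares the final formula $\frac{6x+1}{[6x^2+2x+1]^2}$ with a symmetric difference quotient of f. The helper name `central_difference` and the sample points are assumptions chosen for illustration.

```python
def f(x):
    """The function of Example 2.6.4: x / (2x + 1/(3x+1))."""
    return x / (2.0 * x + 1.0 / (3.0 * x + 1.0))

def f_prime(x):
    """The cleaned-up derivative (6x+1) / (6x^2 + 2x + 1)^2."""
    return (6.0 * x + 1.0) / (6.0 * x * x + 2.0 * x + 1.0) ** 2

def central_difference(func, x, h=1e-6):
    """Estimate func'(x) by (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Sample points chosen away from x = -1/3, where 3x + 1 = 0.
for x in (0.0, 0.5, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```

If the two columns disagreed in the first few decimal places, that would be a strong hint of a slip somewhere in the simplification.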


While the linearity theorem (Theorem 2.4.2) is stated for a linear combination of two 
functions, it is not difficult to extend it to linear combinations of three or more functions 
as the following example shows. 


Example 2.6.5 


We'll start by generalising linearity to three functions.

\begin{align*}
\frac{d}{dx}\{aF(x) + bG(x) + cH(x)\} &= \frac{d}{dx}\{a\cdot[F(x)] + 1\cdot[bG(x) + cH(x)]\} \\
&= aF'(x) + \frac{d}{dx}\{bG(x) + cH(x)\} \\
&\qquad \text{by LIN with $\alpha = a$, $f(x) = F(x)$, $\beta = 1$, and $g(x) = bG(x) + cH(x)$} \\
&= aF'(x) + bG'(x) + cH'(x) \\
&\qquad \text{by LIN with $\alpha = b$, $f(x) = G(x)$, $\beta = c$, and $g(x) = H(x)$}
\end{align*}

This gives us linearity for three terms, namely (just replacing upper case names by lower
case names)

\[ \frac{d}{dx}\{a f(x) + b g(x) + c h(x)\} = a f'(x) + b g'(x) + c h'(x) \]

Just by repeating the above argument many times, we may generalise to linearity for n
terms, for any natural number n:

\[ \frac{d}{dx}\{a_1 f_1(x) + a_2 f_2(x) + \cdots + a_n f_n(x)\} = a_1 f_1'(x) + a_2 f_2'(x) + \cdots + a_n f_n'(x) \]


Similarly, while the product rule is stated for the product of two functions, it is not 
difficult to extend it to the product of three or more functions as the following example 
shows. 


Example 2.6.6 


Once again, we'll start by generalising the product rule to three factors.

\begin{align*}
\frac{d}{dx}\{F(x)\,G(x)\,H(x)\} &= F'(x)\,G(x)\,H(x) + F(x)\,\frac{d}{dx}\{G(x)\,H(x)\} \\
&\qquad \text{by PR with $f(x) = F(x)$ and $g(x) = G(x)\,H(x)$} \\
&= F'(x)\,G(x)\,H(x) + F(x)\,\{G'(x)\,H(x) + G(x)\,H'(x)\} \\
&\qquad \text{by PR with $f(x) = G(x)$ and $g(x) = H(x)$}
\end{align*}

This gives us a product rule for three factors, namely (just replacing upper case names by
lower case names)

\[ \frac{d}{dx}\{f(x)\,g(x)\,h(x)\} = f'(x)\,g(x)\,h(x) + f(x)\,g'(x)\,h(x) + f(x)\,g(x)\,h'(x) \]

Observe that when we differentiate a product of three factors, the answer is a sum of three
terms and in each term the derivative acts on exactly one of the original factors. Just by
repeating the above argument many times, we may generalise the product rule to give the
derivative of a product of n factors, for any natural number n:

\begin{align*}
\frac{d}{dx}\{f_1(x)\,f_2(x)\cdots f_n(x)\} &= f_1'(x)\,f_2(x)\cdots f_n(x) \\
&\quad + f_1(x)\,f_2'(x)\cdots f_n(x) \\
&\quad\ \vdots \\
&\quad + f_1(x)\,f_2(x)\cdots f_n'(x)
\end{align*}

We can also write the above as

\[ \frac{d}{dx}\{f_1(x)\,f_2(x)\cdots f_n(x)\} = \left[\frac{f_1'(x)}{f_1(x)} + \frac{f_2'(x)}{f_2(x)} + \cdots + \frac{f_n'(x)}{f_n(x)}\right] f_1(x)\,f_2(x)\cdots f_n(x) \]

When we differentiate a product of n factors, the answer is a sum of n terms and in
each term the derivative acts on exactly one of the original factors. In the first term, the
derivative acts on the first of the original factors. In the second term, the derivative acts
on the second of the original factors. And so on.

If we make $f_1(x) = f_2(x) = \cdots = f_n(x) = f(x)$ then each of the n terms on the right
hand side of the above equation is the product of f'(x) and exactly n - 1 copies of f(x), and so is
exactly $f(x)^{n-1} f'(x)$. So we get the following useful result

\[ \frac{d}{dx} f(x)^n = n\cdot f(x)^{n-1}\cdot f'(x). \]


This last result is quite useful, so let us write it as a lemma for future reference. 


Lemma 2.6.7. 


Let n be a natural number and f be a differentiable function. Then

\[ \frac{d}{dx} f(x)^n = n\cdot f(x)^{n-1}\cdot f'(x). \]


This immediately gives us another useful result. 


Example 2.6.8

We can now compute the derivative of $x^n$ for any natural number n. Start with Lemma 2.6.7
and substitute f(x) = x and f'(x) = 1:

\[ \frac{d}{dx} x^n = n\cdot x^{n-1}\cdot 1 = n\,x^{n-1} \]


Again, this is a result we will come back to quite a few times in the future, so we
should make sure we can refer to it easily. However, at present this statement only holds
when n is a positive integer. With a little more work we can extend this to compute the
derivative of $x^q$ where q is any positive rational number and then any rational number at all
(positive or negative). So let us hold off for a little longer. Instead we can make it a lemma, since it
will be an ingredient in quite a few of the examples following below and in constructing
the final corollary.


Lemma 2.6.9 (Derivative of $x^n$).


Let n be a positive integer. Then

\[ \frac{d}{dx} x^n = n\,x^{n-1} \]


Back to more examples. 




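Example 2.6.10

\begin{align*}
\frac{d}{dx}\{2x^3 + 4x^5\} &= 2\frac{d}{dx}\{x^3\} + 4\frac{d}{dx}\{x^5\} \\
&\qquad \text{by LIN with $\alpha = 2$, $f(x) = x^3$, $\beta = 4$, and $g(x) = x^5$} \\
&= 2\{3x^2\} + 4\{5x^4\} \\
&\qquad \text{by Lemma 2.6.9, once with $n = 3$, and once with $n = 5$} \\
&= 6x^2 + 20x^4
\end{align*}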


Example 2.6.11 


In this example we'll compute $\frac{d}{dx}\{(3x + 9)(x^2 + 4x^3)\}$ in two different ways. For the first,
we'll start with the product rule.

\begin{align*}
\frac{d}{dx}\{(3x + 9)(x^2 + 4x^3)\} &= \left\{\frac{d}{dx}(3x + 9)\right\}(x^2 + 4x^3) + (3x + 9)\,\frac{d}{dx}\{x^2 + 4x^3\} \\
&= \{3\times 1 + 9\times 0\}(x^2 + 4x^3) + (3x + 9)\{2x + 4(3x^2)\} \\
&= 3(x^2 + 4x^3) + (3x + 9)(2x + 12x^2) \\
&= 3x^2 + 12x^3 + (6x^2 + 18x + 36x^3 + 108x^2) \\
&= 18x + 117x^2 + 48x^3
\end{align*}

For the second, we expand the product first and then differentiate.

\begin{align*}
\frac{d}{dx}\{(3x + 9)(x^2 + 4x^3)\} &= \frac{d}{dx}\{9x^2 + 39x^3 + 12x^4\} \\
&= 9(2x) + 39(3x^2) + 12(4x^3) \\
&= 18x + 117x^2 + 48x^3
\end{align*}


Example 2.6.12

\begin{align*}
\frac{d}{dx}\left\{\frac{4x^3 - 7x}{4x^2 + 1}\right\} &= \frac{(12x^2 - 7)(4x^2 + 1) - (4x^3 - 7x)(8x)}{(4x^2 + 1)^2} \\
&\qquad \text{by QR with $f(x) = 4x^3 - 7x$, $f'(x) = 12x^2 - 7$, and $g(x) = 4x^2 + 1$, $g'(x) = 8x$} \\
&= \frac{(48x^4 - 16x^2 - 7) - (32x^4 - 56x^2)}{(4x^2 + 1)^2} \\
&= \frac{16x^4 + 40x^2 - 7}{(4x^2 + 1)^2}
\end{align*}


Example 2.6.13 


In this example, we'll use a little trickery to find the derivative of $\sqrt[3]{x}$. The trickery consists
of observing that, by the definition of the cube root,

\[ x = \big(\sqrt[3]{x}\big)^3. \]

Since both sides of the expression are the same, they must have the same derivatives:

\[ \frac{d}{dx}\{x\} = \frac{d}{dx}\big(\sqrt[3]{x}\big)^3. \]

We already know by Theorem 2.2.4 that

\[ \frac{d}{dx}\{x\} = 1 \]

and that, by Lemma 2.6.7 with n = 3 and $f(x) = \sqrt[3]{x}$,

\[ \frac{d}{dx}\big(\sqrt[3]{x}\big)^3 = 3\big(\sqrt[3]{x}\big)^2\cdot\frac{d}{dx}\{\sqrt[3]{x}\} = 3x^{2/3}\cdot\frac{d}{dx}\{\sqrt[3]{x}\}. \]

Since we know that $\frac{d}{dx}\{x\} = \frac{d}{dx}\big(\sqrt[3]{x}\big)^3$, we must have

\[ 1 = 3x^{2/3}\cdot\frac{d}{dx}\{\sqrt[3]{x}\} \]

which we can rearrange to give the result we need

\[ \frac{d}{dx}\{\sqrt[3]{x}\} = \frac{1}{3}x^{-2/3}. \]


Example 2.6.14 


In this example, we'll use the same trickery as in Example 2.6.13 to find the derivative of $x^{p/q}$
for any two natural numbers p and q. By definition of the q-th root,

\[ x^p = \big(x^{p/q}\big)^q. \]

That is, $x^p$ and $\big(x^{p/q}\big)^q$ are the same function, and so have the same derivative. So we
differentiate both of them. We already know that, by Lemma 2.6.9 with n = p,

\[ \frac{d}{dx}\{x^p\} = p\,x^{p-1} \]

and that, by Lemma 2.6.7 with n = q and $f(x) = x^{p/q}$,

\[ \frac{d}{dx}\Big\{\big(x^{p/q}\big)^q\Big\} = q\,\big(x^{p/q}\big)^{q-1}\,\frac{d}{dx}\{x^{p/q}\} \]

Remember that $(x^a)^b = x^{a\cdot b}$, so that $\big(x^{p/q}\big)^{q-1} = x^{(pq-p)/q}$. Now these two derivatives must be the same. So

\[ p\,x^{p-1} = q\cdot x^{(pq-p)/q}\,\frac{d}{dx}\{x^{p/q}\} \]

and, rearranging things,

\begin{align*}
\frac{d}{dx}\{x^{p/q}\} &= \frac{p}{q}\,x^{p-1-(pq-p)/q} \\
&= \frac{p}{q}\,x^{(pq-q-pq+p)/q} \\
&= \frac{p}{q}\,x^{p/q-1}
\end{align*}

So finally

\[ \frac{d}{dx}\{x^{p/q}\} = \frac{p}{q}\,x^{p/q-1} \qquad (2.6.2) \]

Notice that this has the same form as Lemma 2.6.9, above, except with n = p/q allowed to
be any positive rational number, not just a positive integer.


Example 2.6.15 (Derivative of $x^{-m}$)


In this example we'll use the quotient rule to find the derivative of $x^{-m}$, for any natural
number m.
By the special case of the quotient rule (Corollary 2.4.6) with $g(x) = x^m$ and $g'(x) = m\,x^{m-1}$,

\[ \frac{d}{dx}\{x^{-m}\} = \frac{d}{dx}\left\{\frac{1}{x^m}\right\} = -\frac{m\,x^{m-1}}{(x^m)^2} = -m\,x^{-m-1} \]

Again, notice that this has the same form as Lemma 2.6.9, above, except with n = -m
being a negative integer.


Example 2.6.16 


In this example we'll use the quotient rule to find the derivative of $x^{-p/q}$, for any pair
of natural numbers p and q. By the special case of the quotient rule (Corollary 2.4.6) with
$g(x) = x^{p/q}$ and $g'(x) = \frac{p}{q}\,x^{p/q-1}$,

\[ \frac{d}{dx}\{x^{-p/q}\} = \frac{d}{dx}\left\{\frac{1}{x^{p/q}}\right\} = -\frac{\frac{p}{q}\,x^{p/q-1}}{(x^{p/q})^2} = -\frac{p}{q}\,x^{-p/q-1} \]


Note that we have found, in Examples 2.2.2, 2.6.14 and 2.6.16, the derivative of $x^a$ for
any rational number a, whether 0, positive, negative, integer or fractional. In all cases, the
answer is

\[ \frac{d}{dx} x^a = a\,x^{a-1} \]

Corollary 2.6.17 (Derivative of $x^a$).


Let a be a rational number. Then

\[ \frac{d}{dx} x^a = a\,x^{a-1} \]

Back in Example 2.2.9 we computed the derivative of $\sqrt{x}$ from the definition of the
derivative. The above corollary (correctly) gives

\[ \frac{d}{dx}\sqrt{x} = \frac{d}{dx} x^{1/2} = \frac{1}{2}\,x^{-1/2} = \frac{1}{2\sqrt{x}} \]

but with far less work.
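The rule $\frac{d}{dx} x^a = a\,x^{a-1}$ is easy to test numerically for a handful of rational exponents. The sketch below is only an illustration under our own assumptions: the exponents and the sample point x = 2 are arbitrary choices, and `central_difference` is a hypothetical helper name for the symmetric difference quotient.

```python
def power_rule(a, x):
    """The formula a * x^(a-1) for the derivative of x^a."""
    return a * x ** (a - 1.0)

def central_difference(func, x, h=1e-6):
    """Estimate func'(x) by (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# A few rational exponents: positive, negative, integer and fractional.
exponents = [5.0, 1.0 / 2.0, 1.0 / 3.0, -2.0, -3.0 / 2.0]
x0 = 2.0

for a in exponents:
    estimate = central_difference(lambda x: x ** a, x0)
    print(a, power_rule(a, x0), estimate)
```

In every row the formula and the numerical estimate should agree to several decimal places, for fractional and negative exponents just as for positive integers.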
Here’s an (optional) messy example. 


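Example 2.6.18 (Optional messy example)

Find the derivative of

\[ f(x) = \frac{(\sqrt{x} - 1)(2 - x)(1 - x^2)}{\sqrt{x}\,(3 + 2x)} \]

• As we have seen before, the best strategy for dealing with nasty expressions is to break
them up into easy pieces. We can think of f(x) as the five-fold product

\[ f(x) = f_1(x)\cdot f_2(x)\cdot f_3(x)\cdot\frac{1}{f_4(x)}\cdot\frac{1}{f_5(x)} \]

with

\[ f_1(x) = \sqrt{x} - 1 \qquad f_2(x) = 2 - x \qquad f_3(x) = 1 - x^2 \qquad f_4(x) = \sqrt{x} \qquad f_5(x) = 3 + 2x \]

• By now, the derivatives of the $f_j$'s should be easy to find:

\[ f_1'(x) = \frac{1}{2\sqrt{x}} \qquad f_2'(x) = -1 \qquad f_3'(x) = -2x \qquad f_4'(x) = \frac{1}{2\sqrt{x}} \qquad f_5'(x) = 2 \]

• Now, to get the derivative f'(x) we use the n-fold product rule which was developed
in Example 2.6.6, together with the special case of the quotient rule (Corollary 2.4.6).

\begin{align*}
f'(x) &= f_1' f_2 f_3 \frac{1}{f_4}\frac{1}{f_5} + f_1 f_2' f_3 \frac{1}{f_4}\frac{1}{f_5} + f_1 f_2 f_3' \frac{1}{f_4}\frac{1}{f_5} - f_1 f_2 f_3 \frac{f_4'}{f_4^2}\frac{1}{f_5} - f_1 f_2 f_3 \frac{1}{f_4}\frac{f_5'}{f_5^2} \\
&= \left[\frac{f_1'}{f_1} + \frac{f_2'}{f_2} + \frac{f_3'}{f_3} - \frac{f_4'}{f_4} - \frac{f_5'}{f_5}\right] f_1 f_2 f_3 \frac{1}{f_4}\frac{1}{f_5} \\
&= \left[\frac{1}{2\sqrt{x}(\sqrt{x} - 1)} - \frac{1}{2 - x} - \frac{2x}{1 - x^2} - \frac{1}{2x} - \frac{2}{3 + 2x}\right]\frac{(\sqrt{x} - 1)(2 - x)(1 - x^2)}{\sqrt{x}\,(3 + 2x)}
\end{align*}

The trick that we used in going from the first line to the second line, namely multiplying
term number j by $\frac{f_j(x)}{f_j(x)}$, is often useful in simplifying the derivative of a
product of many factors¹³.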


2.7 Derivatives of Exponential Functions


Now that we understand how derivatives interact with products and quotients, we are 
able to compute derivatives of 


• polynomials,


• rational functions, and


• powers and roots of rational functions.


13 Also take a look at “logarithmic differentiation” in Section 2.10.

Notice that all of the above come from knowing¹⁴ the derivative of $x^n$ and applying linearity
of derivatives and the product rule.

There is still one more “rule” that we need to complete our toolbox and that is the
chain rule. However before we get there, we will add a few functions to our list of things
we can differentiate¹⁵. The first of these is the exponential function.

Let a > 0 and set $f(x) = a^x$; this is what is known as an exponential function. Let's
see what happens when we try to compute the derivative of this function just using the
definition of the derivative.

\begin{align*}
\frac{df}{dx} &= \lim_{h\to 0}\frac{f(x+h) - f(x)}{h} = \lim_{h\to 0}\frac{a^{x+h} - a^x}{h} \\
&= \lim_{h\to 0} a^x\cdot\frac{a^h - 1}{h} = a^x\cdot\lim_{h\to 0}\frac{a^h - 1}{h}
\end{align*}

Unfortunately we cannot complete this computation because we cannot evaluate the last
limit directly. For the moment, let us assume this limit exists and name it

\[ C(a) = \lim_{h\to 0}\frac{a^h - 1}{h} \]

It depends only on a and is completely independent of x. Using this notation (which we
will quickly improve upon below), our desired derivative is now

\[ \frac{d}{dx} a^x = C(a)\cdot a^x. \]

Thus the derivative of $a^x$ is $a^x$ multiplied by some constant, i.e. the function $a^x$ is nearly
unchanged by differentiating. If we can tune a so that C(a) = 1 then the derivative would
just be the original function! This turns out to be very useful.

To try finding an a that obeys C(a) = 1, let us investigate how C(a) changes with a.
Unfortunately (though this fact is not at all obvious) there is no way to write C(a) as a
finite combination of any of the functions we have examined so far¹⁶. To get started, we'll
try to guess C(a), for a few values of a, by plugging in some small values of h.


Example 2.7.1

Let a = 1 then $C(1) = \lim_{h\to 0}\frac{1^h - 1}{h} = 0$. This is not surprising since $1^x = 1$ is constant, and
so its derivative must be zero everywhere. Let a = 2 then $C(2) = \lim_{h\to 0}\frac{2^h - 1}{h}$. Setting h to
smaller and smaller numbers gives

h            | 0.1    | 0.01   | 0.001  | 0.0001 | 0.00001 | 0.000001 | 0.0000001
(2^h - 1)/h  | 0.7177 | 0.6956 | 0.6934 | 0.6932 | 0.6931  | 0.6931   | 0.6931

Similarly when a = 3 we get

h            | 0.1    | 0.01   | 0.001  | 0.0001 | 0.00001 | 0.000001 | 0.0000001
(3^h - 1)/h  | 1.1612 | 1.1047 | 1.0992 | 1.0987 | 1.0986  | 1.0986   | 1.0986

and a = 10

h            | 0.1    | 0.01   | 0.001  | 0.0001 | 0.00001 | 0.000001 | 0.0000001
(10^h - 1)/h | 2.5893 | 2.3293 | 2.3052 | 2.3028 | 2.3026  | 2.3026   | 2.3026

From this example it appears that C(a) increases as we increase a, and that C(a) = 1 for
some value of a between 2 and 3.


14 Differentiating powers and roots of functions is actually quite a bit easier once one knows the chain
rule, which we will discuss soon.

15 One reason we add these functions is that they interact very nicely with the derivative. Another reason
is that they turn up in many “real world” examples.

16 To be a bit more precise, we say that a number q is algebraic if we can write q as the zero of a polynomial
with integer coefficients. When a is any positive algebraic number other than 1, C(a) is not algebraic. A
number that is not algebraic is called transcendental. The best known example of a transcendental
number is π (which follows from the Lindemann-Weierstrass Theorem, way beyond the scope of this
course).
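The guesses in this example are easy to regenerate. Here is a small Python sketch (Python is our choice, and the function name `difference_quotient` is hypothetical) that evaluates $(a^h - 1)/h$ for a = 2, 3, 10 and for the same shrinking values of h used above.

```python
def difference_quotient(a, h):
    """Return (a**h - 1) / h, which approaches C(a) as h shrinks to 0."""
    return (a ** h - 1.0) / h

# h = 0.1, 0.01, ..., 0.0000001, the step sizes used in the example
step_sizes = [10.0 ** (-k) for k in range(1, 8)]

for a in (2.0, 3.0, 10.0):
    row = ["%.4f" % difference_quotient(a, h) for h in step_sizes]
    print("a =", a, row)
```

Each row should settle down as h shrinks, in the same way as the values tabulated in the example. Pushing h very much smaller is not a good idea on a computer: eventually rounding error in $a^h - 1$ swamps the answer.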


We can learn a lot more about C(a), and, in particular, confirm the guesses that we 
made in the last example, by making use of logarithms — this would be a good time for 
you to review them. 


» Whirlwind review of logarithms 


Before you read much further into this little review on logarithms, you should first go 
back and take a look at the review of inverse functions in Section 0.6. 


>>> Logarithmic functions 


We are about to define the “logarithm with base q”. In principle, q is allowed to be any
strictly positive real number, except q = 1. However we shall restrict our attention to
q > 1, because, in practice, the only q's that are ever used are e (a number that we shall
define in the next few pages), 10 and, if you are a computer scientist, 2. So, fix any q > 1
(if you like, pretend that q = 10). The function $f(x) = q^x$

• increases as x increases (for example if $x' > x$, then $10^{x'} = 10^x\cdot 10^{x'-x} > 10^x$ since
$10^{x'-x} > 1$)

• obeys $\lim_{x\to-\infty} q^x = 0$ (for example $10^{-1000}$ is really small) and

• obeys $\lim_{x\to+\infty} q^x = +\infty$ (for example $10^{+1000}$ is really big).

Consequently, for any $0 < Y < \infty$, the horizontal straight line y = Y crosses the graph of
$y = f(x) = q^x$ at exactly one point, as illustrated in the figure below. The x-coordinate
of that intersection point, denoted X in the figure, is $\log_q(Y)$. So $\log_q(Y)$ is the power to
which you have to raise q to get Y. It is the inverse function of $f(x) = q^x$. Of course we
are free to rename the dummy variables X and Y. If, for example, we wish to graph our
logarithm function, it is natural to rename $Y \to x$ and $X \to y$, giving


Let q > 1. Then the logarithm with base q is defined¹⁷ by

\[ y = \log_q(x) \iff x = q^y \]


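Obviously the power to which we have to raise q to get $q^x$ is x, so we have both

\[ \log_q(q^x) = x \qquad q^{\log_q(x)} = x \]

From the exponential properties

\begin{align*}
q^{\log_q(xy)} &= xy = q^{\log_q(x)}\,q^{\log_q(y)} = q^{\log_q(x) + \log_q(y)} \\
q^{\log_q(x/y)} &= x/y = q^{\log_q(x)}/q^{\log_q(y)} = q^{\log_q(x) - \log_q(y)} \\
q^{\log_q(x^r)} &= x^r = \big(q^{\log_q(x)}\big)^r = q^{r\log_q(x)}
\end{align*}

we have

\begin{align*}
\log_q(xy) &= \log_q(x) + \log_q(y) \\
\log_q(x/y) &= \log_q(x) - \log_q(y) \\
\log_q(x^r) &= r\log_q(x)
\end{align*}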


17 We can also define logarithms with base 0 < r < 1 but doing so is not necessary. To see this, set
q = 1/r > 1. Then it is reasonable to define $\log_r(x) = -\log_q(x)$ since

\[ r^{\log_r(x)} = \left(\frac{1}{q}\right)^{\log_r(x)} = \left(\frac{1}{q}\right)^{-\log_q(x)} = q^{\log_q(x)} = x \]

as required.




Can we convert from logarithms in one base to logarithms in another? For example, if
our calculator computes logarithms base 10 for us (which it very likely does), can we also
use it to compute a logarithm base q? Yes, using

\[ \log_q(x) = \frac{\log_{10} x}{\log_{10} q} \]

How did we get this? Well, let's start with a number x and suppose that we want to
compute

\[ y = \log_q x \]

We can rearrange this by exponentiating both sides

\[ q^y = q^{\log_q x} = x \]

Now take log base 10 of both sides

\[ \log_{10} q^y = \log_{10} x \]

But recall that $\log_q(x^r) = r\log_q(x)$, so

\begin{align*}
y\,\log_{10} q &= \log_{10} x \\
y &= \frac{\log_{10} x}{\log_{10} q}
\end{align*}
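The change of base formula is exactly what one would type into a calculator that only has a base 10 logarithm key. As a minimal sketch (the function name `log_base` is our own, hypothetical choice), Python's standard `math.log10` can play the role of that key:

```python
import math

def log_base(q, x):
    """Compute log_q(x) from base 10 logarithms: log10(x) / log10(q)."""
    return math.log10(x) / math.log10(q)

# log_2(8) should be 3, since 2^3 = 8.
print(log_base(2.0, 8.0))

# Raising q to the power log_q(x) should give back x.
print(2.0 ** log_base(2.0, 5.0))

# Python's two-argument math.log(x, q) computes the same quantity.
print(math.log(8.0, 2.0))
```

Up to rounding error, the three printed values should be 3, 5 and 3 respectively.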
» Back to that limit 
Recall that we are trying to choose a so that

\[ \lim_{h\to 0}\frac{a^h - 1}{h} = C(a) = 1 \]

We can estimate the correct value of a by using our numerical estimate of C(10) above.
The way to do this is to first rewrite C(a) in terms of logarithms.

\[ a = 10^{\log_{10} a} \qquad\text{and so}\qquad a^h = 10^{h\log_{10} a}. \]

Using this we rewrite C(a) as

\[ C(a) = \lim_{h\to 0}\frac{1}{h}\left(10^{h\log_{10} a} - 1\right) \]

Now set $H = h\log_{10}(a)$, and notice that as $h \to 0$ we also have $H \to 0$

\begin{align*}
C(a) &= \lim_{H\to 0}\frac{\log_{10} a}{H}\left(10^H - 1\right) \\
&= \log_{10} a\cdot\lim_{H\to 0}\frac{10^H - 1}{H} \\
&= \log_{10} a\cdot C(10).
\end{align*}

Below is a sketch of C(a) against a.

Figure 2.7.1.


Remember that we are trying to find an a with C(a) = 1. We can do so by recognising
that $C(a) = C(10)\,\log_{10} a$ has the following properties.

• When a = 1, $\log_{10}(a) = \log_{10} 1 = 0$ so that $C(a) = C(10)\log_{10}(a) = 0$. Of course,
we should have expected this, because when a = 1 we have $a^x = 1^x = 1$ which is
just the constant function and $\frac{d}{dx} 1 = 0$.

• $\log_{10} a$ increases as a increases, and hence $C(a) = C(10)\log_{10} a$ increases as a
increases.

• $\log_{10} a$ tends to $+\infty$ as $a \to \infty$, and hence C(a) tends to $+\infty$ as $a \to \infty$.

Hence the graph of C(a) passes through (1,0), is always increasing as a increases and goes
off to $+\infty$ as a goes off to $+\infty$. See Figure 2.7.1. Consequently¹⁸ there is exactly one value
of a for which C(a) = 1.


The value of a for which C(a) = 1 is given the name e. It is called Euler's constant¹⁹.


In Example 2.7.1, we estimated $C(10) \approx 2.3026$. So if we assume C(a) = 1 then the above
equation becomes

\begin{align*}
2.3026\cdot\log_{10} a &\approx 1 \\
\log_{10} a &\approx \frac{1}{2.3026} \approx 0.4343 \\
a &\approx 10^{0.4343} \approx 2.7183
\end{align*}

18 We are applying the Intermediate Value Theorem here, but we have neglected to verify the hypothesis
that $\log_{10}(a)$ is a continuous function. Please forgive us: we could do this if we really had to, but
it would make a big mess without adding much understanding, if we were to do so here in the text.
Better to just trust us on this.

19 Unfortunately there is another Euler's constant, γ, which is more properly called the Euler-Mascheroni
constant. Anyway like many mathematical discoveries, e was first found by someone else: Napier
used the constant e in order to compute logarithms but only implicitly. Bernoulli was probably the first
to approximate it when examining continuous compound interest. It first appeared explicitly in work
of Leibniz, though he denoted it b. It was Euler, though, who established the notation we now use and
who showed how important the constant is to mathematics.

This gives us the estimate $a \approx 2.7183$ which is not too bad. In fact²⁰


Equation 2.7.3 (Euler's constant).


\[ e = 2.7182818284590452354\ldots = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots \qquad (2.7.1) \]


We will be able to explain this last formula once we develop Taylor polynomials later 
in the course. 
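Both routes to e can be tried out numerically. The sketch below is an illustration under our own assumptions: it estimates C(10) with the small step h = 0.0000001 (as in Example 2.7.1), solves $C(10)\cdot\log_{10} a = 1$ for a, and separately adds up the first few terms of the standard series $1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$. The function names are hypothetical.

```python
import math

def difference_quotient(a, h=1e-7):
    """Estimate C(a) by (a**h - 1) / h for a small step h."""
    return (a ** h - 1.0) / h

def partial_sum(n_terms):
    """Sum the first n_terms terms of 1/0! + 1/1! + 1/2! + ..."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term /= (k + 1)  # turn 1/k! into 1/(k+1)!
    return total

# Route 1: solve C(10) * log10(a) = 1 for a.
c10 = difference_quotient(10.0)
a_estimate = 10.0 ** (1.0 / c10)

# Route 2: partial sums of the series.
print(a_estimate, partial_sum(5), partial_sum(10), math.e)
```

The series converges remarkably quickly: a dozen or so terms already exhaust the precision of ordinary floating point arithmetic.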
To summarise 


Theorem 2.7.4. 


The constant e is the unique real number that satisfies

\[ \lim_{h\to 0}\frac{e^h - 1}{h} = 1 \]

Further,

\[ \frac{d}{dx}\left(e^x\right) = e^x \]


We plot $e^x$ in the graph below


Figure 2.7.2.


And just a reminder of some of its²¹ properties...


20 Recall n factorial, written n!, is the product $n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$.


21 The function $e^x$ is of course the special case of the function $a^x$ with a = e. So it inherits all the usual
algebraic properties of $a^x$.

4. $(e^x)^y = e^{xy}$


5. $\lim_{x\to\infty} e^x = \infty$, $\lim_{x\to-\infty} e^x = 0$
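Now consider again the problem of differentiating $a^x$. We saw above that

\[
\frac{d}{dx} a^x = C(a) \cdot a^x \quad\text{and}\quad C(a) = C(10) \cdot \log_{10} a \quad\text{which gives}\quad \frac{d}{dx} a^x = C(10) \cdot \log_{10} a \cdot a^x
\]

We can eliminate the $C(10)$ term with a little care. Since we know that $\frac{d}{dx} e^x = e^x$, we have
$C(e) = 1$. This allows us to express

\[
1 = C(e) = C(10) \cdot \log_{10} e \quad\text{and so}\quad C(10) = \frac{1}{\log_{10} e}
\]

Putting things back together gives

\[
\frac{d}{dx} a^x = \frac{\log_{10} a}{\log_{10} e} \cdot a^x = \log_e a \cdot a^x.
\]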
There is more than one way to get to this result. For example, let f(x) = a*, then 


\[
\log_e f(x) = x \log_e a \qquad\text{so}\qquad f(x) = e^{x \log_e a}
\]

So if we write $g(x) = e^x$ then we are really attempting to differentiate the function

\[ \frac{df}{dx} = \frac{d}{dx}\, g(x \cdot \log_e a). \]

In order to compute this derivative we need to know how to differentiate

\[ \frac{d}{dx}\, g(qx) \]

where $q$ is a constant. We'll hold off on learning this for the moment until we have introduced
the chain rule (see Section 2.9 and in particular Corollary 2.9.9). Similarly we'd like
to know how to differentiate logarithms. Again this has to wait until we have learned
the chain rule.

Notice that the derivatives 


\[ \frac{d}{dx} x^n = n x^{n-1} \qquad\text{and}\qquad \frac{d}{dx} e^x = e^x \]

are either nearly unchanged or actually unchanged by differentiating. It turns out that
some of the trigonometric functions also have this property of being “nearly unchanged”
by differentiation. That brings us to the next section.

2.8 Derivatives of trigonometric functions


We are now going to compute the derivatives of the various trigonometric functions, sin x, 
cos x and so on. The computations are more involved than the others that we have done 
so far and will take several steps. Fortunately, the final answers will be very simple. 
Observe that we only need to work out the derivatives of sinx and cos x, since the 
other trigonometric functions are really just quotients of these two functions. Recall: 


\[
\tan x = \frac{\sin x}{\cos x} \qquad \cot x = \frac{\cos x}{\sin x} \qquad \csc x = \frac{1}{\sin x} \qquad \sec x = \frac{1}{\cos x}.
\]


The first step towards computing the derivatives of sin x and cos x is to find their derivatives
at x = 0. The derivatives at general points x will follow quickly from these, using
trig identities. It is important to note that we must measure angles in radians²², rather
than degrees, in what follows. Indeed, unless explicitly stated otherwise, any number
that is put into a trigonometric function is measured in radians.


» These proofs are optional, the results are not.

While we expect you to read and follow these proofs, we do not expect you to be able
to reproduce them. You will be required to know the results, in particular Theorem 2.8.4
below.


» Step 1: $\frac{d}{dx}\{\sin x\}\big|_{x=0}$


By definition, the derivative of sin x evaluated at x = 0 is

\[
\frac{d}{dx}\{\sin x\}\Big|_{x=0} = \lim_{h \to 0} \frac{\sin h - \sin 0}{h} = \lim_{h \to 0} \frac{\sin h}{h}
\]

We will evaluate this limit by use of the squeeze theorem (Theorem 1.4.17). To get there we
will first need to do some geometry. But first we will build some intuition.


The figure below contains part of a circle of radius 1. Recall that an arc of length h on
such a circle subtends an angle of h radians at the centre of the circle. So the darkened arc
in the figure has length h and the darkened vertical line in the figure has length sin h. We
must determine what happens to the ratio of the lengths of the darkened vertical line and
darkened arc as h tends to zero.


22 In science, radians is the standard unit for measuring angles. While you may be more familiar with
degrees, radians should be used in any computation involving calculus. Using degrees will cause
errors. Thankfully it is easy to translate between these two measures since 360° = 2π radians. See
Appendix B.2.1.

Here is a magnified version of the part of the above figure that contains the darkened arc
and vertical line.


This particular figure has been drawn with h = 0.4 radians. Here are three more such blow-ups.
In each successive figure, the value of h is smaller. To make the figures clearer, the
degree of magnification was increased each time h was decreased.


As we make h smaller and smaller and look at the figure with ever increasing magnification,
the arc of length h and vertical line of length sin h look more and more alike. We
would guess from this that

\[ \lim_{h \to 0} \frac{\sin h}{h} = 1 \]

The following tables of values

    h        sin h           (sin h)/h    |    h         sin h            (sin h)/h
    0.4      .3894           .9735        |   -0.4      -.3894            .9735
    0.2      .1987           .9934        |   -0.2      -.1987            .9934
    0.1      .09983          .9983        |   -0.1      -.09983           .9983
    0.05     .049979         .99958       |   -0.05     -.049979          .99958
    0.01     .00999983       .999983      |   -0.01     -.00999983        .999983
    0.001    .00099999983    .99999983    |   -0.001    -.00099999983     .99999983


suggest the same guess. Here is an argument that shows that the guess really is correct.


>>> Proof that $\lim_{h \to 0} \frac{\sin h}{h} = 1$:


The circle in the figure above has radius 1. Hence

\[
|OP| = |OR| = 1 \qquad |PS| = \sin h \qquad |OS| = \cos h \qquad |QR| = \tan h
\]

Now we can use a few geometric facts about this figure to establish both an upper bound
and a lower bound on $\frac{\sin h}{h}$, with both the upper and lower bounds tending to 1 as h tends
to 0. So the squeeze theorem will tell us that $\frac{\sin h}{h}$ also tends to 1 as h tends to 0.


• The triangle OPR has base 1 and height sin h, and hence

\[ \text{area of } \triangle OPR = \frac{1}{2} \times 1 \times \sin h = \frac{\sin h}{2} \]

• The triangle OQR has base 1 and height tan h, and hence

\[ \text{area of } \triangle OQR = \frac{1}{2} \times 1 \times \tan h = \frac{\tan h}{2} \]

• The “piece of pie” OPR cut out of the circle is the fraction $\frac{h}{2\pi}$ of the whole circle
(since the angle at the corner of the piece of pie is h radians and the angle for the
whole circle is 2π radians). Since the circle has radius 1 we have

\[ \text{area of pie } OPR = \frac{h}{2\pi} \cdot (\text{area of circle}) = \frac{h}{2\pi} \cdot \pi \cdot 1^2 = \frac{h}{2} \]


Now the triangle OPR is contained inside the piece of pie OPR, and so the area of the
triangle is smaller than the area of the piece of pie. Similarly, the piece of pie OPR is
contained inside the triangle OQR. Thus we have


area of triangle OPR < area of pie OPR < area of triangle OQR

Substituting in the areas we worked out gives

\[ \frac{\sin h}{2} < \frac{h}{2} < \frac{\tan h}{2} \]

which cleans up to give

\[ \sin h < h < \frac{\sin h}{\cos h} \]

We rewrite these two inequalities so that $\frac{\sin h}{h}$ appears in both.

• Since $\sin h < h$, we have that $\frac{\sin h}{h} < 1$.

• Since $h < \frac{\sin h}{\cos h}$, we have that $\cos h < \frac{\sin h}{h}$.

Thus we arrive at the “squeezable” inequality

\[ \cos h < \frac{\sin h}{h} < 1 \]


We know²³ that

\[ \lim_{h \to 0} \cos h = 1. \]

Since $\frac{\sin h}{h}$ is sandwiched between cos h and 1, we can apply the squeeze theorem for limits
(Theorem 1.4.17) to deduce the following lemma:


Lemma 2.8.1.

\[ \lim_{h \to 0} \frac{\sin h}{h} = 1 \]


23 Again, refresh your memory by looking up Appendix A.5.

Since this argument took a bit of work, perhaps we should remind ourselves why we
needed it in the first place. We were computing

\[
\begin{aligned}
\frac{d}{dx}\{\sin x\}\Big|_{x=0} &= \lim_{h \to 0} \frac{\sin h - \sin 0}{h} \\
&= \lim_{h \to 0} \frac{\sin h}{h} \qquad \text{(This is why!)} \\
&= 1
\end{aligned}
\]

This concludes Step 1. We now know that $\frac{d}{dx} \sin x \big|_{x=0} = 1$. The remaining steps are easier.
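
The squeeze argument can also be checked numerically. The short Python sketch below is our own illustration (the helper names `squeeze_bounds` and `is_squeezed` are assumptions made for this example). For a few small positive values of h, measured in radians, it evaluates cos h, the ratio sin(h)/h and the upper bound 1, and reports whether the ratio really does sit between the two bounds.

```python
import math


def squeeze_bounds(h):
    """Return (cos h, sin(h)/h, 1) for a nonzero angle h in radians."""
    return math.cos(h), math.sin(h) / h, 1.0


def is_squeezed(h):
    """Check the inequality cos h < sin(h)/h < 1 at a single value of h."""
    lower, middle, upper = squeeze_bounds(h)
    return lower < middle < upper


# The same values of h that appear in the table of values earlier in this step.
for h in (0.4, 0.2, 0.1, 0.05, 0.01, 0.001):
    lower, middle, upper = squeeze_bounds(h)
    print(f"h={h:<6} cos h={lower:.8f}  sin(h)/h={middle:.8f}  squeezed={is_squeezed(h)}")
```

A numerical check like this is of course not a proof; it only makes the inequality and the limit plausible. For extremely small h, floating point rounding can make the computed ratio equal to 1, so the sketch sticks to moderate values.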


» Step 2: $\frac{d}{dx}\{\cos x\}\big|_{x=0}$


By definition, the derivative of cos x evaluated at x = 0 is

\[ \lim_{h \to 0} \frac{\cos h - \cos 0}{h} = \lim_{h \to 0} \frac{\cos h - 1}{h} \]

Fortunately we don't have to wade through geometry like we did for the previous step.
Instead we can recycle our work and massage the above limit to rewrite it in terms of
expressions involving $\frac{\sin h}{h}$. Thanks to Lemma 2.8.1 the work is then easy.
We'll show you two ways to proceed. One uses a method similar to “multiplying
by the conjugate” that we have already used a few times (see Examples 1.4.16 and 2.2.9),
while the other uses a nice trick involving the double-angle formula²⁴.


>>> Method 1: multiply by the “conjugate”


Start by multiplying the expression inside the limit by 1, written as $\frac{\cos h + 1}{\cos h + 1}$:

\[
\begin{aligned}
\frac{\cos h - 1}{h} &= \frac{\cos h - 1}{h} \cdot \frac{\cos h + 1}{\cos h + 1} \\
&= \frac{\cos^2 h - 1}{h(1 + \cos h)} && \text{(since } (a-b)(a+b) = a^2 - b^2\text{)} \\
&= -\frac{\sin^2 h}{h(1 + \cos h)} && \text{(since } \sin^2 h + \cos^2 h = 1\text{)} \\
&= -\frac{\sin h}{h} \cdot \frac{\sin h}{1 + \cos h}
\end{aligned}
\]

Now we can take the limit as h → 0 via Lemma 2.8.1.

\[
\begin{aligned}
\lim_{h \to 0} \frac{\cos h - 1}{h} &= \lim_{h \to 0} \left( -\frac{\sin h}{h} \cdot \frac{\sin h}{1 + \cos h} \right) \\
&= -\lim_{h \to 0} \left( \frac{\sin h}{h} \right) \cdot \lim_{h \to 0} \left( \frac{\sin h}{1 + \cos h} \right) \\
&= -1 \cdot \frac{0}{2} \\
&= 0
\end{aligned}
\]
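24 See Appendix A.12 if you have forgotten. You should also recall that $\sin^2\theta + \cos^2\theta = 1$. Sorry for
nagging.

>>> Method 2: via the double angle formula


The other way involves the double angle formula²⁵,

\[ \cos 2\theta = 1 - 2\sin^2(\theta) \qquad\text{or}\qquad \cos 2\theta - 1 = -2\sin^2(\theta) \]

Setting $\theta = h/2$, we have

\[ \frac{\cos h - 1}{h} = \frac{-2\left(\sin\frac{h}{2}\right)^2}{h} \]

Now this begins to look like $\frac{\sin h}{h}$, except that inside the sin(·) we have h/2. So, setting
$\theta = h/2$,

\[
\begin{aligned}
\frac{\cos h - 1}{h} &= -\frac{\sin^2\theta}{\theta} = -\theta \cdot \frac{\sin^2\theta}{\theta^2} \\
&= -\theta \cdot \frac{\sin\theta}{\theta} \cdot \frac{\sin\theta}{\theta}
\end{aligned}
\]

When we take the limit as h → 0, we are also taking the limit as $\theta = h/2 \to 0$, and so

\[
\begin{aligned}
\lim_{h \to 0} \frac{\cos h - 1}{h} &= \lim_{\theta \to 0} \left[ -\theta \cdot \frac{\sin\theta}{\theta} \cdot \frac{\sin\theta}{\theta} \right] \\
&= \lim_{\theta \to 0} \left[ -\theta \right] \cdot \lim_{\theta \to 0} \left[ \frac{\sin\theta}{\theta} \right] \cdot \lim_{\theta \to 0} \left[ \frac{\sin\theta}{\theta} \right] \\
&= 0 \cdot 1 \cdot 1 \\
&= 0
\end{aligned}
\]

where we have used the fact that $\lim_{h \to 0} \frac{\sin h}{h} = 1$ and that the limit of a product is the
product of limits (i.e. Lemma 2.8.1 and Theorem 1.4.2).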


Thus we have now produced two proofs of the following lemma: 


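Lemma 2.8.2.

\[ \lim_{h \to 0} \frac{\cos h - 1}{h} = 0 \]


Again, there has been a bit of work to get to here, so we should remind ourselves why
we needed it. We were computing

\[
\begin{aligned}
\frac{d}{dx}\{\cos x\}\Big|_{x=0} &= \lim_{h \to 0} \frac{\cos h - \cos 0}{h} \\
&= \lim_{h \to 0} \frac{\cos h - 1}{h} \\
&= 0
\end{aligned}
\]


Armed with these results we can now build up the derivatives of sine and cosine.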
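
As a sanity check on Lemma 2.8.2, here is a small Python sketch of our own (the function names `cosine_quotient` and `conjugate_form` are assumptions for this illustration). It evaluates the difference quotient (cos h − 1)/h directly, and also through the rewritten form from Method 1, and shows both shrinking towards 0 as h gets small.

```python
import math


def cosine_quotient(h):
    """Return (cos(h) - 1)/h, the difference quotient of cos at 0."""
    return (math.cos(h) - 1.0) / h


def conjugate_form(h):
    """Return the Method 1 rewrite: -(sin h / h) * (sin h / (1 + cos h))."""
    return -(math.sin(h) / h) * (math.sin(h) / (1.0 + math.cos(h)))


# Both columns should agree and both should tend to 0 as h shrinks.
for h in (0.4, 0.1, 0.01, 0.001):
    print(h, cosine_quotient(h), conjugate_form(h))
```

The two columns agree to many digits, as the algebra in Method 1 says they must, and both head to 0.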
25 I hope you looked this up in Appendix A.12. Nag.

» Step 3: $\frac{d}{dx}\{\sin x\}$ and $\frac{d}{dx}\{\cos x\}$ for general x


To proceed to the general derivatives of sin x and cos x we are going to use the above two
results and a couple of trig identities. Remember the addition formulae²⁶


sin(a + b) = sin(a) cos(b) + cos(a) sin(b) 


cos(a + b) = cos(a) cos(b) — sin(a) sin(b) 


To compute the derivative of sin(x) we just start from the definition of the derivative: 


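\[
\begin{aligned}
\frac{d}{dx} \sin x &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \\
&= \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h \to 0} \left[ \sin x \, \frac{\cos h - 1}{h} + \cos x \, \frac{\sin h - 0}{h} \right] \\
&= \sin x \lim_{h \to 0} \frac{\cos h - 1}{h} + \cos x \lim_{h \to 0} \frac{\sin h - 0}{h} \\
&= \sin x \underbrace{\left[ \frac{d}{dx} \cos x \right]_{x=0}}_{=0} + \cos x \underbrace{\left[ \frac{d}{dx} \sin x \right]_{x=0}}_{=1} \\
&= \cos x
\end{aligned}
\]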


The computation of the derivative of cos x is very similar. 


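\[
\begin{aligned}
\frac{d}{dx} \cos x &= \lim_{h \to 0} \frac{\cos(x + h) - \cos x}{h} \\
&= \lim_{h \to 0} \frac{\cos x \cos h - \sin x \sin h - \cos x}{h} \\
&= \lim_{h \to 0} \left[ \cos x \, \frac{\cos h - 1}{h} - \sin x \, \frac{\sin h - 0}{h} \right] \\
&= \cos x \lim_{h \to 0} \frac{\cos h - 1}{h} - \sin x \lim_{h \to 0} \frac{\sin h - 0}{h} \\
&= \cos x \underbrace{\left[ \frac{d}{dx} \cos x \right]_{x=0}}_{=0} - \sin x \underbrace{\left[ \frac{d}{dx} \sin x \right]_{x=0}}_{=1} \\
&= -\sin x
\end{aligned}
\]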


We have now found the derivatives of both sin x and cos x, provided x is measured in radians. 


26 You really should. Look this up in Appendix A.8 if you have forgotten.

Lemma 2.8.3.

\[ \frac{d}{dx} \sin x = \cos x \qquad \frac{d}{dx} \cos x = -\sin x \]

The above formulas hold provided x is measured in radians.


These formulae are pretty easy to remember: applying $\frac{d}{dx}$ to sin x and cos x just
exchanges sin x and cos x, except for the minus sign²⁷ in the derivative of cos x.

Note that if x is measured in degrees then the above work is wrong. There are similar 
formulas, but we need the chain rule to build them — that is the subject of the next section. 
But first we should find the derivatives of the other trig functions. 
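
To see why the choice of radians matters, we can estimate slopes numerically. The Python sketch below is our own illustration, not part of the proofs above; the helper names `central_difference` and `sin_degrees` are assumptions. It approximates derivatives with a symmetric difference quotient, first for the usual sine and cosine (radians) and then for a sine function whose input is in degrees.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def sin_degrees(x_deg):
    """Sine of an angle that is given in degrees rather than radians."""
    return math.sin(math.radians(x_deg))


# In radians the numerical slopes match cos x and -sin x.
print(central_difference(math.sin, 1.0), math.cos(1.0))
print(central_difference(math.cos, 1.0), -math.sin(1.0))

# In degrees the slope of sine at 60 degrees is nowhere near cos(60 degrees) = 0.5.
print(central_difference(sin_degrees, 60.0), math.cos(math.radians(60.0)))
```

The last line prints a slope close to 0.0087, which is cos(60°) scaled by π/180. Explaining where that extra factor comes from requires the chain rule.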


» Step 4: the remaining trigonometric functions 


It is now an easy matter to get the derivatives of the remaining trigonometric functions 
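using basic trig identities and the quotient rule. Remember²⁸ that

\[
\tan x = \frac{\sin x}{\cos x} \qquad \cot x = \frac{\cos x}{\sin x} = \frac{1}{\tan x} \qquad \csc x = \frac{1}{\sin x} \qquad \sec x = \frac{1}{\cos x}
\]

So, by the quotient rule,

\[
\begin{aligned}
\frac{d}{dx} \tan x &= \frac{d}{dx} \frac{\sin x}{\cos x} = \frac{\overbrace{\left( \frac{d}{dx} \sin x \right)}^{\cos x} \cos x - \sin x \overbrace{\left( \frac{d}{dx} \cos x \right)}^{-\sin x}}{\cos^2 x} = \sec^2 x \\
\frac{d}{dx} \csc x &= \frac{d}{dx} \frac{1}{\sin x} = -\frac{\overbrace{\left( \frac{d}{dx} \sin x \right)}^{\cos x}}{\sin^2 x} = -\csc x \cot x \\
\frac{d}{dx} \sec x &= \frac{d}{dx} \frac{1}{\cos x} = -\frac{\overbrace{\left( \frac{d}{dx} \cos x \right)}^{-\sin x}}{\cos^2 x} = \sec x \tan x \\
\frac{d}{dx} \cot x &= \frac{d}{dx} \frac{\cos x}{\sin x} = \frac{\overbrace{\left( \frac{d}{dx} \cos x \right)}^{-\sin x} \sin x - \cos x \overbrace{\left( \frac{d}{dx} \sin x \right)}^{\cos x}}{\sin^2 x} = -\csc^2 x
\end{aligned}
\]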


27 There is a bad pun somewhere in here about sine errors and sign errors.

28 You really should. If you do not then take a quick look at the appropriate appendix.

» Summary


To summarise all this work, we can write this up as a theorem: 


Theorem 2.8.4 (Derivative of trigonometric functions). 


The derivatives of sin x and cos x are

\[ \frac{d}{dx} \sin x = \cos x \qquad \frac{d}{dx} \cos x = -\sin x \]

Consequently the derivatives of the other trigonometric functions are

\[ \frac{d}{dx} \tan x = \sec^2 x \qquad \frac{d}{dx} \cot x = -\csc^2 x \]

\[ \frac{d}{dx} \csc x = -\csc x \cot x \qquad \frac{d}{dx} \sec x = \sec x \tan x \]


Of these 6 derivatives you should really memorise those of sine, cosine and tangent. 
We certainly expect you to be able to work out those of cotangent, cosecant and secant. 
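
All six formulas in Theorem 2.8.4 can be spot-checked numerically. The Python sketch below is our own illustration; the names `central_difference`, `TRIG_DERIVATIVES` and `max_error` are assumptions made for this example. Each function is paired with its claimed derivative, and the claim is compared against a symmetric difference quotient, with x in radians.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Each entry pairs a function with the derivative stated in Theorem 2.8.4.
TRIG_DERIVATIVES = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: 1.0 / math.cos(x) ** 2),                      # sec^2 x
    "cot": (lambda x: math.cos(x) / math.sin(x), lambda x: -1.0 / math.sin(x) ** 2),  # -csc^2 x
    "csc": (lambda x: 1.0 / math.sin(x), lambda x: -math.cos(x) / math.sin(x) ** 2),  # -csc x cot x
    "sec": (lambda x: 1.0 / math.cos(x), lambda x: math.sin(x) / math.cos(x) ** 2),   # sec x tan x
}


def max_error(x):
    """Largest gap at x between the numerical slope and the stated derivative."""
    return max(abs(central_difference(f, x) - df(x)) for f, df in TRIG_DERIVATIVES.values())


for name, (f, df) in TRIG_DERIVATIVES.items():
    print(name, central_difference(f, 0.7), df(0.7))
```

The test point x = 0.7 is chosen away from the zeros of sin x and cos x, where some of these functions are undefined.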


2.9 One more tool – the chain rule


We have built up most of the tools that we need to express derivatives of complicated 
functions in terms of derivatives of simpler known functions. We started by learning how 
to evaluate 


• derivatives of sums, products and quotients
• derivatives of constants and monomials


These tools allow us to compute derivatives of polynomials and rational functions. In the 
previous sections, we added exponential and trigonometric functions to our list. The final 
tool we add is called the chain rule. It tells us how to take the derivative of a composition 
of two functions. That is if we know f(x) and g(x) and their derivatives, then the chain 
rule tells us the derivative of f (g(x)). 

Before we get to the statement of the rule, let us look at an example showing how such 
a composition might arise (in the “real-world”). 


Example 2.9.1


You are out in the woods after a long day of mathematics and are walking towards your
camp fire on a beautiful still night. The heat from the fire means that the air temperature
depends on your position. Let your position at time t be x(t). The temperature of the
air at position x is f(x). What instantaneous rate of change of temperature do you feel at
time t?

[Figure: graph of the temperature f(x) against position x, with the campfire marked.]


• Because your position at time t is x = x(t), the temperature you feel at time t is
F(t) = f(x(t)).


• The instantaneous rate of change of temperature that you feel is F′(t). We have a
complicated function, F(t), constructed by composing two simpler functions, x(t)
and f(x).


• We wish to compute the derivative, $F'(t) = \frac{d}{dt} f(x(t))$, of the complicated function
F(t) in terms of the derivatives, x′(t) and f′(x), of the two simple functions. This is
exactly what the chain rule does.


(End of Example 2.9.1)


» Statement of the chain rule 


Theorem 2.9.2 (The chain rule — version 1). 


Let $a \in \mathbb{R}$ and let g(x) be a function that is differentiable at x = a and set b = g(a).
Now let f(u) be a function that is differentiable at u = b. Then the function
F(x) = f(g(x)) is differentiable at x = a and

\[ F'(a) = f'(g(a)) \, g'(a) \]


Here, as was the case earlier in this chapter, we have been very careful to give the point 
at which the derivative is evaluated a special name (i.e. a). But of course this evaluation 
point can really be any point (where the derivative is defined). So it is very common to 
just call the evaluation point “x” rather than give it a special name like “a”, like this: 


Theorem 2.9.3 (The chain rule — version 2). 


Let f and g be differentiable functions then

\[ \frac{d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x) \]

Notice that when we form the composition f(g(x)) there is an “outside” function
(namely f(x)) and an “inside” function (namely g(x)). The chain rule tells us that when
we differentiate a composition we have to differentiate the outside and then multiply
by the derivative of the inside.

\[ \frac{d}{dx} f(g(x)) = \underbrace{f'(g(x))}_{\text{diff outside}} \cdot \underbrace{g'(x)}_{\text{diff inside}} \]

Here is another statement of the chain rule which makes this idea more explicit. 


Theorem 2.9.4 (The chain rule — version 3). 


Let y = f(u) and u = g(x) be differentiable functions, then

\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} \]


This particular form is easy to remember because it looks like we can just “cancel” the
du between the two terms.

\[ \frac{dy}{dx} = \frac{dy}{\cancel{du}} \cdot \frac{\cancel{du}}{dx} \]

Of course, du is not, by itself, a number or variable²⁹ that can be cancelled. But this is
still a good memory aid.

The hardest part about applying the chain rule is recognising when the function you 
are trying to differentiate is really the composition of two simpler functions. This takes a 
little practice. We can warm up with a couple of simple examples. 


Example 2.9.5 


Let $f(u) = u^5$ and $g(x) = \sin(x)$. Then set $F(x) = f(g(x)) = (\sin(x))^5$. To find the
derivative of F(x) we can simply apply the chain rule, since the pieces of the composition
have been laid out for us. Here they are.

\[
\begin{aligned}
f(u) &= u^5 & f'(u) &= 5u^4 \\
g(x) &= \sin(x) & g'(x) &= \cos(x)
\end{aligned}
\]

We now just put them together as the chain rule tells us

\[
\begin{aligned}
\frac{dF}{dx} &= f'(g(x)) \cdot g'(x) \\
&= 5 (g(x))^4 \cdot \cos(x) && \text{since } f'(u) = 5u^4 \\
&= 5 (\sin(x))^4 \cdot \cos(x)
\end{aligned}
\]

29 In this context du is called a differential. There are ways to understand and manipulate these in calculus
but they are beyond the scope of this course.

Notice that it is quite easy to extend this to any power. Set $f(u) = u^n$. Then follow the
same steps and we arrive at

\[ F(x) = (\sin(x))^n \qquad F'(x) = n (\sin(x))^{n-1} \cos(x) \]

(End of Example 2.9.5)


This example shows one of the ways that the chain rule appears very frequently — 
when we need to differentiate the power of some simpler function. More generally we 
have the following. 


Example 2.9.6 


Let $f(u) = u^n$ and let g(x) be any differentiable function. Set $F(x) = f(g(x)) = g(x)^n$.
Then

\[ \frac{dF}{dx} = \frac{d}{dx} \left( g(x)^n \right) = n \, g(x)^{n-1} \cdot g'(x) \]

This is precisely the result in Example 2.6.6 and Lemma 2.6.7.


(End of Example 2.9.6)


Example 2.9.7 


Let f(u) = cos(u) and g(x) = 3x — 2. Find the derivative of 


F(x) = f(g(x)) = cos(3x — 2). 


Again we should approach this by first writing down f and g and their derivatives 
and then putting everything together as the chain rule tells us. 


\[
\begin{aligned}
f(u) &= \cos(u) & f'(u) &= -\sin(u) \\
g(x) &= 3x - 2 & g'(x) &= 3
\end{aligned}
\]

So the chain rule says

\[
\begin{aligned}
F'(x) &= f'(g(x)) \cdot g'(x) \\
&= -\sin(g(x)) \cdot 3 \\
&= -3 \sin(3x - 2)
\end{aligned}
\]

(End of Example 2.9.7)

This example shows a second way that the chain rule appears very frequently: when
we need to differentiate some function of ax + b. More generally we have the following.


Example 2.9.8 


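Let $a, b \in \mathbb{R}$ and let f(x) be a differentiable function. Set g(x) = ax + b. Then

\[
\begin{aligned}
\frac{d}{dx} f(ax + b) &= \frac{d}{dx} f(g(x)) \\
&= f'(g(x)) \cdot g'(x) \\
&= f'(ax + b) \cdot a
\end{aligned}
\]

So the derivative of f(ax + b) with respect to x is just a f′(ax + b).


(End of Example 2.9.8)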
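
The result of Example 2.9.8, that the derivative of f(ax + b) is a f′(ax + b), is easy to test on a concrete function. The Python sketch below is our own illustration; the helper names `central_difference` and `linear_chain_rule` are assumptions. It applies the formula to F(x) = cos(3x − 2) from Example 2.9.7 and compares the answer with a numerical slope.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def linear_chain_rule(f_prime, a, b):
    """Return the function x -> a * f'(a*x + b), the derivative of f(a*x + b)."""
    return lambda x: a * f_prime(a * x + b)


# Example 2.9.7: F(x) = cos(3x - 2), so f = cos, a = 3 and b = -2.
F = lambda x: math.cos(3.0 * x - 2.0)
F_prime = linear_chain_rule(lambda u: -math.sin(u), 3.0, -2.0)

print(F_prime(0.5), central_difference(F, 0.5))
```

Both printed numbers should agree to several decimal places, matching F′(x) = −3 sin(3x − 2) at x = 0.5.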


The above is a very useful result that follows from the chain rule, so let's make it a corollary
to highlight it.


Corollary 2.9.9.


Let $a, b \in \mathbb{R}$ and let f(x) be a differentiable function, then

\[ \frac{d}{dx} f(ax + b) = a f'(ax + b). \]


Example 2.9.10 (Example 2.9.1, continued) 


Let us now go back to our motivating campfire example. There we had 


\[
\begin{aligned}
f(x) &= \text{temperature at position } x \\
x(t) &= \text{position at time } t \\
F(t) &= f(x(t)) = \text{temperature at time } t
\end{aligned}
\]

The chain rule gave

\[ F'(t) = f'(x(t)) \cdot x'(t) \]

Notice that the units of measurement on both sides of the equation agree, as indeed
they must. To see this, let us assume that t is measured in seconds, that x(t) is measured
in metres and that f(x) is measured in degrees. Because of this F(t) = f(x(t)) must also be
measured in degrees (since it is a temperature).

What about the derivatives? These are rates of change. So

• F′(t) has units $\frac{\text{degrees}}{\text{second}}$,

• f′(x) has units $\frac{\text{degrees}}{\text{metre}}$, and

• x′(t) has units $\frac{\text{metre}}{\text{second}}$.

Hence the product

\[ f'(x(t)) \cdot x'(t) \quad\text{has units}\quad \frac{\text{degrees}}{\text{metre}} \cdot \frac{\text{metre}}{\text{second}} = \frac{\text{degrees}}{\text{second}} \]

has the same units as F′(t). So the units on both sides of the equation agree. Checking that
the units on both sides of an equation agree is a good check of consistency, but of course
it does not prove that both sides are in fact the same.


(End of Example 2.9.10)


» (optional) — Derivation of the chain rule 


First, let’s review what our goal is. We have been given a function g(x), that is differen- 
tiable at some point x = a, and another function f(u), that is differentiable at the point 
u = b = g(a). We have defined the composite function F(x) = f(g(x)) and we wish to 
show that 


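\[ F'(a) = f'(g(a)) \cdot g'(a) \]

Before we can compute F′(a), we need to set up some ground work, and in particular
the definitions of our given derivatives:

\[ f'(b) = \lim_{H \to 0} \frac{f(b + H) - f(b)}{H} \qquad\text{and}\qquad g'(a) = \lim_{h \to 0} \frac{g(a + h) - g(a)}{h}. \]

We are going to use similar manipulation tricks as we did back in the proofs of the arithmetic
of derivatives in Section 2.5. Unfortunately, we have already used up the symbols
“F” and “H”, so we are going to make use of the Greek letters $\gamma$, $\varphi$.

As was the case in our derivation of the product rule it is convenient to introduce a
couple of new functions. Set

\[ \varphi(H) = \frac{f(b + H) - f(b)}{H} \]

Then we have

\[ \lim_{H \to 0} \varphi(H) = f'(b) = f'(g(a)) \qquad \text{since } b = g(a), \tag{2.9.1} \]

and we can also write (with a little juggling)

\[ f(b + H) = f(b) + H \varphi(H) \]

Similarly set

\[ \gamma(h) = \frac{g(a + h) - g(a)}{h} \]

which gives us

\[ \lim_{h \to 0} \gamma(h) = g'(a) \qquad\text{and}\qquad g(a + h) = g(a) + h \gamma(h). \]

Now we can start computing

\[
\begin{aligned}
F'(a) &= \lim_{h \to 0} \frac{f(g(a + h)) - f(g(a))}{h} \\
&= \lim_{h \to 0} \frac{f(b + h \gamma(h)) - f(b)}{h}
\end{aligned}
\]

where we have used g(a) = b and g(a + h) = g(a) + hγ(h).

Now for the sneaky bit. We can turn $f(b + h\gamma(h))$ into $f(b + H)$ by setting

\[ H = h \gamma(h) \]

Now notice that as h → 0 we have

\[
\begin{aligned}
\lim_{h \to 0} H &= \lim_{h \to 0} h \cdot \gamma(h) \\
&= \lim_{h \to 0} h \cdot \lim_{h \to 0} \gamma(h) \\
&= 0 \cdot g'(a) = 0
\end{aligned}
\]

So as h → 0 we also have H → 0.
We now have

\[
\begin{aligned}
F'(a) &= \lim_{h \to 0} \frac{f(b + H) - f(b)}{h} \\
&= \lim_{h \to 0} \frac{f(b) + H \varphi(H) - f(b)}{h} && \text{since } f(b + H) = f(b) + H \varphi(H) \\
&= \lim_{h \to 0} \frac{H \varphi(H)}{h} \\
&= \lim_{h \to 0} \varphi(H) \cdot \gamma(h) && \text{remember that } H = h \gamma(h) \\
&= \lim_{h \to 0} \varphi(H) \cdot \lim_{h \to 0} \gamma(h) \\
&= \lim_{H \to 0} \varphi(H) \cdot \lim_{h \to 0} \gamma(h) && \text{since } H \to 0 \text{ as } h \to 0 \\
&= f'(b) \cdot g'(a)
\end{aligned}
\]

This is exactly the RHS of the chain rule.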


» Chain Rule Examples 


We'll now use the chain rule to compute some more derivatives. 


Example 2.9.11


Find $\frac{d}{dx} (1 + 3x)^{75}$.

This is a concrete version of Example 2.9.8. We are to find the derivative of a function
that is built up by first computing 1 + 3x and then taking the 75th power of the result. So
we set

\[
\begin{aligned}
f(u) &= u^{75} & f'(u) &= 75 u^{74} \\
g(x) &= 1 + 3x & g'(x) &= 3 \\
F(x) &= f(g(x)) = g(x)^{75} = (1 + 3x)^{75}
\end{aligned}
\]

By the chain rule

\[
\begin{aligned}
F'(x) &= f'(g(x)) \, g'(x) = 75 \, g(x)^{74} \, g'(x) = 75 (1 + 3x)^{74} \cdot 3 \\
&= 225 (1 + 3x)^{74}
\end{aligned}
\]

(End of Example 2.9.11)

Example 2.9.12


Find $\frac{d}{dx} \sin(x^2)$.

In this example we are to compute the derivative of sin with a (slightly) complicated
argument. So we apply the chain rule with f being sin and g(x) being the complicated
argument. That is, we set

\[
\begin{aligned}
f(u) &= \sin u & f'(u) &= \cos u \\
g(x) &= x^2 & g'(x) &= 2x \\
F(x) &= f(g(x)) = \sin(g(x)) = \sin(x^2)
\end{aligned}
\]

By the chain rule

\[ F'(x) = f'(g(x)) \, g'(x) = \cos(g(x)) \, g'(x) = \cos(x^2) \cdot 2x \]

(End of Example 2.9.12)

Example 2.9.13


Find $\frac{d}{dx} \sqrt[3]{\sin(x^2)}$.

In this example we are to compute the derivative of the cube root of a (moderately)
complicated argument, namely $\sin(x^2)$. So we apply the chain rule with f being “cube
root” and g(x) being the complicated argument. That is, we set

\[
\begin{aligned}
f(u) &= \sqrt[3]{u} = u^{1/3} & f'(u) &= \tfrac{1}{3} u^{-2/3} \\
g(x) &= \sin(x^2) & g'(x) &= 2x \cos(x^2)
\end{aligned}
\]

In computing g′(x) here, we have already used the chain rule once (in Example 2.9.12). By
the chain rule

\[
\begin{aligned}
F'(x) &= f'(g(x)) \, g'(x) = \tfrac{1}{3} g(x)^{-2/3} \cdot 2x \cos(x^2) \\
&= \frac{2x \cos(x^2)}{3 \left[ \sin(x^2) \right]^{2/3}}
\end{aligned}
\]

(End of Example 2.9.13)
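
The pattern used in Examples 2.9.12 and 2.9.13, namely differentiate the outside, evaluate it at the inside, and multiply by the derivative of the inside, can be written as a small reusable helper. The Python sketch below is our own illustration; the names `chain_rule` and `central_difference` are assumptions. It rebuilds both examples and compares the final answer with a numerical slope at x = 1, where sin(x²) is positive so the cube root causes no trouble.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def chain_rule(f_prime, g, g_prime):
    """Return the function x -> f'(g(x)) * g'(x)."""
    return lambda x: f_prime(g(x)) * g_prime(x)


# Example 2.9.12: g(x) = sin(x^2), with g'(x) = cos(x^2) * 2x.
g = lambda x: math.sin(x * x)
g_prime = chain_rule(math.cos, lambda x: x * x, lambda x: 2.0 * x)

# Example 2.9.13: F(x) = cube root of g(x), with outside derivative (1/3) u^(-2/3).
F = lambda x: g(x) ** (1.0 / 3.0)
F_prime = chain_rule(lambda u: u ** (-2.0 / 3.0) / 3.0, g, g_prime)

print(F_prime(1.0), central_difference(F, 1.0))
```

Note how the derivative of the inside function in Example 2.9.13 is itself produced by an earlier call to `chain_rule`, mirroring the way the example reuses Example 2.9.12.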


Example 2.9.14 


Find $\frac{d}{dx} f(g(h(x)))$.

This is very similar to the previous example. Let us set F(x) = f(g(h(x))) with u =
g(h(x)) then the chain rule tells us

\[
\begin{aligned}
\frac{dF}{dx} &= \frac{df}{du} \cdot \frac{du}{dx} \\
&= f'(g(h(x))) \cdot \frac{d}{dx} g(h(x))
\end{aligned}
\]

We now just apply the chain rule again

\[ \frac{dF}{dx} = f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x). \]

Indeed it is not too hard to generalise further (in the manner of Example 2.6.6) to find
the derivative of the composition of 4 or more functions (though things start to become
tedious to write down):

\[
\begin{aligned}
\frac{d}{dx} f_1(f_2(f_3(f_4(x)))) &= f_1'(f_2(f_3(f_4(x)))) \cdot \frac{d}{dx} f_2(f_3(f_4(x))) \\
&= f_1'(f_2(f_3(f_4(x)))) \cdot f_2'(f_3(f_4(x))) \cdot \frac{d}{dx} f_3(f_4(x)) \\
&= f_1'(f_2(f_3(f_4(x)))) \cdot f_2'(f_3(f_4(x))) \cdot f_3'(f_4(x)) \cdot f_4'(x)
\end{aligned}
\]

(End of Example 2.9.14)
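
The pattern for a composition of several functions is mechanical enough to hand to a computer. The following Python sketch is our own illustration (the function name `compose_derivative` and its list-based interface are assumptions). It multiplies together the derivative of each layer, evaluated at whatever the layers inside it produce.

```python
import math


def compose_derivative(functions, derivatives, x):
    """Derivative at x of f1(f2(...fn(x)...)) by repeated use of the chain rule.

    functions = [f1, ..., fn] (outermost first), derivatives = [f1', ..., fn'].
    """
    # Work from the inside out, recording the input that each function receives.
    inputs = []
    value = x
    for f in reversed(functions):
        inputs.append(value)
        value = f(value)
    inputs.reverse()  # now inputs[i] is the argument fed to functions[i]

    # Multiply the derivative of each layer, evaluated at that layer's input.
    product = 1.0
    for df, u in zip(derivatives, inputs):
        product *= df(u)
    return product


# sin(exp(x^2)) has derivative cos(exp(x^2)) * exp(x^2) * 2x.
slope = compose_derivative(
    [math.sin, math.exp, lambda x: x * x],
    [math.cos, math.exp, lambda x: 2.0 * x],
    0.5,
)
print(slope)
```

With three functions this reproduces f′(g(h(x))) · g′(h(x)) · h′(x), and with four it reproduces the longer formula displayed above.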


Example 2.9.15 


We can also use the chain rule to recover Corollary 2.4.6 and from there we can use the 
product rule to recover the quotient rule. 
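We want to differentiate $F(x) = \frac{1}{g(x)}$ so set $f(u) = \frac{1}{u}$ and u = g(x). Then the chain rule
tells us

\[
\begin{aligned}
\frac{d}{dx} \left\{ \frac{1}{g(x)} \right\} = \frac{dF}{dx} &= \frac{df}{du} \cdot \frac{du}{dx} \\
&= \frac{-1}{u^2} \cdot g'(x) \\
&= -\frac{g'(x)}{g(x)^2}
\end{aligned}
\]

Once we know this, a quick application of the product rule will give us the quotient rule.

\[
\begin{aligned}
\frac{d}{dx} \left\{ \frac{f(x)}{g(x)} \right\} &= \frac{d}{dx} \left\{ f(x) \cdot \frac{1}{g(x)} \right\} && \text{use the product rule} \\
&= f'(x) \cdot \frac{1}{g(x)} + f(x) \cdot \frac{d}{dx} \left\{ \frac{1}{g(x)} \right\} && \text{use the result from above} \\
&= f'(x) \cdot \frac{1}{g(x)} - f(x) \cdot \frac{g'(x)}{g(x)^2} && \text{place over a common denominator} \\
&= \frac{f'(x) \cdot g(x) - f(x) \cdot g'(x)}{g(x)^2}
\end{aligned}
\]

which is exactly the quotient rule.

(End of Example 2.9.15)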
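Example 2.9.16


Compute the following derivative:

\[ \frac{d}{dx} \cos\left( \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right) \]

This time we are to compute the derivative of cos with a really complicated argument.


• So, to start, we apply the chain rule with $g(x) = \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3}$ being the really complicated
argument and f being cos. That is, f(u) = cos(u). Since f′(u) = −sin(u), the chain
rule gives

\[ \frac{d}{dx} \cos\left( \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right) = -\sin\left( \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right) \cdot \frac{d}{dx} \left\{ \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right\} \]

• This reduced our problem to that of computing the derivative of the really complicated
argument $\frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3}$. We can think of the argument as being built up out of three
pieces, namely $x^5$, multiplied by $\sqrt{3 + x^6}$, divided by $(4 + x^2)^3$, or, equivalently, multiplied
by $(4 + x^2)^{-3}$. So we may rewrite the argument as $x^5 (3 + x^6)^{1/2} (4 + x^2)^{-3}$ and
then apply the product rule to reduce the problem to that of computing the derivatives
of the three pieces.


• Here goes (recall Example 2.6.6):

\[
\begin{aligned}
\frac{d}{dx} \left[ x^5 (3 + x^6)^{1/2} (4 + x^2)^{-3} \right] &= \frac{d}{dx} \left[ x^5 \right] \cdot (3 + x^6)^{1/2} \cdot (4 + x^2)^{-3} \\
&\quad + x^5 \cdot \frac{d}{dx} \left[ (3 + x^6)^{1/2} \right] \cdot (4 + x^2)^{-3} \\
&\quad + x^5 \cdot (3 + x^6)^{1/2} \cdot \frac{d}{dx} \left[ (4 + x^2)^{-3} \right]
\end{aligned}
\]

This has reduced our problem to computing the derivatives of $x^5$, which is easy, and
of $(3 + x^6)^{1/2}$ and $(4 + x^2)^{-3}$, both of which can be done by the chain rule. Doing so,

\[
\begin{aligned}
\frac{d}{dx} \left[ x^5 (3 + x^6)^{1/2} (4 + x^2)^{-3} \right] &= \overbrace{5x^4}^{\frac{d}{dx}[x^5]} \cdot (3 + x^6)^{1/2} \cdot (4 + x^2)^{-3} \\
&\quad + x^5 \cdot \overbrace{\tfrac{1}{2} (3 + x^6)^{-1/2} \cdot 6x^5}^{\frac{d}{dx}[(3 + x^6)^{1/2}]} \cdot (4 + x^2)^{-3} \\
&\quad + x^5 \cdot (3 + x^6)^{1/2} \cdot \overbrace{(-3)(4 + x^2)^{-4} \cdot 2x}^{\frac{d}{dx}[(4 + x^2)^{-3}]}
\end{aligned}
\]

• Now we can clean things up in a sneaky way by observing

  – differentiating $x^5$, to get $5x^4$, is the same as multiplying $x^5$ by $\frac{5}{x}$, and

  – differentiating $(3 + x^6)^{1/2}$ to get $\tfrac{1}{2} (3 + x^6)^{-1/2} \cdot 6x^5$ is the same as multiplying
    $(3 + x^6)^{1/2}$ by $\frac{3x^5}{3 + x^6}$, and

  – differentiating $(4 + x^2)^{-3}$ to get $-3 (4 + x^2)^{-4} \cdot 2x$ is the same as multiplying
    $(4 + x^2)^{-3}$ by $-\frac{6x}{4 + x^2}$.

Using these sneaky tricks we can write our solution quite neatly:

\[ \frac{d}{dx} \cos\left( \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right) = -\sin\left( \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \right) \frac{x^5 \sqrt{3 + x^6}}{(4 + x^2)^3} \left\{ \frac{5}{x} + \frac{3x^5}{3 + x^6} - \frac{6x}{4 + x^2} \right\} \]

• This method of cleaning up the derivative of a messy product is actually something
more systematic in disguise, namely logarithmic differentiation. We will come to
this later.


(End of Example 2.9.16)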
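
With an answer this messy it is reassuring to test it at a sample point. The Python sketch below is our own check, not part of the original solution; the function names `argument`, `derivative_formula` and `central_difference` are assumptions. It codes the final formula of Example 2.9.16 and compares it with a numerical slope of the original function.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def argument(x):
    """The complicated argument x^5 * sqrt(3 + x^6) / (4 + x^2)^3."""
    return x ** 5 * math.sqrt(3.0 + x ** 6) / (4.0 + x ** 2) ** 3


def derivative_formula(x):
    """The final answer of Example 2.9.16 (valid for x != 0 as written)."""
    bracket = 5.0 / x + 3.0 * x ** 5 / (3.0 + x ** 6) - 6.0 * x / (4.0 + x ** 2)
    return -math.sin(argument(x)) * argument(x) * bracket


x0 = 1.3  # an arbitrary nonzero sample point
print(derivative_formula(x0), central_difference(lambda t: math.cos(argument(t)), x0))
```

Agreement at one or two sample points does not prove the formula, but a sign error or a dropped factor would almost certainly show up here.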


2.10 The natural logarithm


The chain rule opens the way to understanding derivatives of more complicated functions.
Not only compositions of known functions, as we have seen in the examples of the previous
section, but also functions which are defined implicitly.

Consider the logarithm base e: $\log_e(x)$ is the power that e must be raised to to give
x. That is, $\log_e(x)$ is defined by

\[ e^{\log_e x} = x \]

i.e. it is the inverse of the exponential function with base e. Since this choice of base
works so cleanly and easily with respect to differentiation, this base turns out to be (arguably)
the most natural choice for the base of the logarithm. And as we saw in our
whirlwind review of logarithms in Section 2.7, it is easy to use logarithms of one base to
compute logarithms with another base:

\[ \log_q x = \frac{\log_p x}{\log_p q} \]

So we are (relatively) free to choose a base which is convenient for our purposes.

The logarithm with base e is called the “natural logarithm”. The “naturalness” of logarithms
base e is exactly that this choice of base works very nicely in calculus (and so wider
mathematics) in ways that other bases do not³⁰. There are several different “standard”
notations for the logarithm base e;

\[ \log_e x = \log x = \ln x. \]

We recommend that you be able to recognise all of these.

In this text we will write the natural logarithm as “log” with no base. The reason for
this choice is that base e is the standard choice of base for logarithms in mathematics³¹. The
natural logarithm inherits many properties of general logarithms³². So, for all x, y > 0 the
following hold:

• $e^{\log x} = x$,

• for any real number X, $\log\left(e^X\right) = X$,

• for any a > 1, $\log_a x = \frac{\log x}{\log a}$ and $\log x = \frac{\log_a x}{\log_a e}$

• $\log 1 = 0$, $\log e = 1$

• $\log(xy) = \log x + \log y$

• $\log\left(\frac{x}{y}\right) = \log x - \log y$, $\log\left(\frac{1}{y}\right) = -\log y$

• $\log(x^X) = X \log x$

• $\lim_{x \to \infty} \log x = \infty$, $\lim_{x \to 0^+} \log x = -\infty$

And finally we should remember that log x has domain (i.e. is defined for) x > 0 and
range (i.e. takes all values in) $-\infty < x < \infty$.


30 The interested reader should head to Wikipedia and look up the natural logarithm.


31 In other disciplines other bases are natural; in computer science, since numbers are stored in binary it 
makes sense to use the binary logarithm — i.e. base 2. While in some sciences and finance, it makes 
sense to use the decimal logarithm — i.e. base 10. 

32 Again take a quick look at the whirlwind review of logarithms in Section 2.7.

Figure 2.10.1.


To compute the derivative of log x we could attempt to start with the limit definition
of the derivative

\[
\begin{aligned}
\frac{d}{dx} \log x &= \lim_{h \to 0} \frac{\log(x + h) - \log(x)}{h} \\
&= \lim_{h \to 0} \frac{\log((x + h)/x)}{h} \\
&= \text{um} \ldots
\end{aligned}
\]

This doesn't look good. But all is not lost: we have the chain rule, and we know that the
logarithm satisfies the equation:

\[ x = e^{\log x} \]

Since both sides of the equation are the same function, both sides of the equation have the
same derivative, i.e. we are using³³

\[ \text{if } f(x) = g(x) \text{ for all } x, \text{ then } f'(x) = g'(x) \]


So now differentiate both sides:

\[ \frac{d}{dx} x = \frac{d}{dx} e^{\log x} \]

The left-hand side is easy, and the right-hand side we can process using the chain rule
with $f(u) = e^u$ and $u = \log x$.

\[
\begin{aligned}
1 &= \frac{df}{du} \cdot \frac{du}{dx} \\
&= e^u \cdot \underbrace{\frac{d}{dx} \log x}_{\text{what we want to compute}}
\end{aligned}
\]

33 Notice that just because the derivatives are the same, doesn't mean the original functions are the same.
Both $f(x) = x^2$ and $g(x) = x^2 + 3$ have derivative $f'(x) = g'(x) = 2x$, but $f(x) \neq g(x)$.

Recall that $e^u = e^{\log x} = x$, so

\[ 1 = x \cdot \underbrace{\frac{d}{dx} \log x}_{\text{now what?}} \]

We can now just rearrange this equation to make the thing we want the subject:

\[ \frac{d}{dx} \log x = \frac{1}{x} \]

Thus we have proved:


Theorem 2.10.1.

\[ \frac{d}{dx} \log x = \frac{1}{x} \]

where log x is the logarithm base e.
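
Theorem 2.10.1 is easy to test numerically, since Python's `math.log` with a single argument is the natural logarithm. The sketch below is our own illustration (the helper names `central_difference` and `log_slope_error` are assumptions). It compares a numerical slope of log x with 1/x at a few positive values of x.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def log_slope_error(x):
    """Gap between the numerical slope of log at x and the claimed value 1/x."""
    return abs(central_difference(math.log, x) - 1.0 / x)


# log x is only defined for x > 0, so we only sample positive values.
for x in (0.5, 1.0, 2.0, 10.0):
    print(x, central_difference(math.log, x), 1.0 / x, log_slope_error(x))
```

The gaps printed in the last column are tiny, in line with the theorem.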


Example 2.10.2


Let f(x) = log 3x. Find f′(x).

There are two ways to approach this. We can simplify then differentiate, or differentiate
and then simplify. Neither is difficult.


• Simplify and then differentiate:

\[
\begin{aligned}
f(x) &= \log 3x && \text{log of a product} \\
&= \log 3 + \log x \\
f'(x) &= \frac{d}{dx} \log 3 + \frac{d}{dx} \log x \\
&= \frac{1}{x}
\end{aligned}
\]

• Differentiate and then simplify:

\[
\begin{aligned}
f'(x) &= \frac{d}{dx} \log(3x) && \text{chain rule} \\
&= \frac{1}{3x} \cdot 3 \\
&= \frac{1}{x}
\end{aligned}
\]

(End of Example 2.10.2)

Example 2.10.3
Notice we can extend the previous example for any non-zero constant, not just 3. Let c
be a non-zero constant, then

\[
\begin{aligned}
\frac{d}{dx} \log cx &= \frac{d}{dx} \left( \log c + \log x \right) \\
&= \frac{1}{x}
\end{aligned}
\]

(End of Example 2.10.3)
Example 2.10.4


We can push this further still. Let g(x) = log |x|, then³⁴


• If x > 0, |x| = x and so

\[ g'(x) = \frac{d}{dx} \log x = \frac{1}{x} \]

• If x < 0 then |x| = −x = −1 · x and so

\[ g'(x) = \frac{d}{dx} \log(-1 \cdot x) = \frac{1}{x} \qquad \text{by the previous example} \]

• Since log 0 is undefined, g′(0) does not exist.

Putting this together gives:

\[ \frac{d}{dx} \log |x| = \frac{1}{x} \]

(End of Example 2.10.4)


We can extend Theorem 2.10.1 to compute the derivative of logarithms of other bases 
in a straightforward way. Since for any positive a # 1: 


1 1 
log, x = BO -log x since a is a constant 
loga loga 
lo ens 
dx 2a loga x 


34 It’s probably a good moment to go back and look at Example 2.2.10. d 
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d .x 
» Back to xa 


We can also now finally get around to computing the derivative of a* (which we started 
to do back in Section 2.7). 


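$$\begin{aligned}
f(x) &= a^x && \text{take log of both sides}\\
\log f(x) &= x\log a && \text{exponentiate both sides base } e\\
f(x) &= e^{x\log a} && \text{chain rule}\\
f'(x) &= e^{x\log a}\cdot\log a\\
&= a^x\cdot\log a
\end{aligned}$$

Alternatively, we could proceed as follows:

$$\begin{aligned}
f(x) &= a^x && \text{take log of both sides}\\
\log f(x) &= x\log a && \text{differentiate both sides}\\
\frac{d}{dx}\left(\log f(x)\right) &= \log a
\end{aligned}$$

We then process the left-hand side using the chain rule

$$\begin{aligned}
f'(x)\cdot\frac{1}{f(x)} &= \log a\\
f'(x) &= f(x)\cdot\log a = a^x\cdot\log a
\end{aligned}$$

We will see $\frac{d}{dx}\log f(x)$ more below in the subsection on “logarithmic differentiation”.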
To summarise the results above: 


Corollary 2.10.5.

$$\frac{d}{dx}a^x = \log a\cdot a^x \qquad \text{for any } a > 0$$

$$\frac{d}{dx}\log_a x = \frac{1}{x\cdot\log a} \qquad \text{for any } a > 0,\ a \ne 1$$


where log x is the natural logarithm. 


Recall that we need the caveat a ≠ 1 because the logarithm base 1 is not well defined. This is because 1^x = 1 for any x. We do not need a similar caveat for the derivative of the exponential because we know (recall Example 2.7.1)

$$\frac{d}{dx}1^x = \frac{d}{dx}1 = 0 \qquad \text{while the above corollary tells us}$$

$$\frac{d}{dx}1^x = \log 1\cdot 1^x = 0\cdot 1 = 0.$$
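The two formulas of the corollary, and the remark about a = 1, can be checked numerically. The following is a minimal sketch; the function names `exp_rule` and `log_rule` and the sample values a = 2, x = 1.5 are assumptions made for illustration only.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def exp_rule(a, x):
    """The corollary's formula for the derivative of a**x."""
    return math.log(a) * a ** x

def log_rule(a, x):
    """The corollary's formula for the derivative of log base a of x."""
    return 1 / (x * math.log(a))

a, x = 2.0, 1.5
print(central_diff(lambda t: a ** t, x), exp_rule(a, x))
print(central_diff(lambda t: math.log(t, a), x), log_rule(a, x))

# With a = 1 the exponential formula still makes sense: log 1 = 0, so the slope is 0.
print(exp_rule(1.0, x))
```

Calling `log_rule(1.0, x)` would divide by log 1 = 0, which mirrors the caveat a ≠ 1 in the corollary.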




» Logarithmic differentiation 


I want to go back to some previous slightly messy examples (Examples 2.6.6 and 2.6.18) 
and now show you how they can be done more easily. 


Example 2.10.6 


Consider again the derivative of the product of 3 functions: 
$$P(x) = F(x)\cdot G(x)\cdot H(x)$$

Start by taking the logarithm of both sides:

$$\begin{aligned}
\log P(x) &= \log\big(F(x)\cdot G(x)\cdot H(x)\big)\\
&= \log F(x) + \log G(x) + \log H(x)
\end{aligned}$$

Notice that the product of functions on the right-hand side has become a sum of functions. Differentiating sums is much easier than differentiating products. So when we differentiate we have

$$\frac{d}{dx}\log P(x) = \frac{d}{dx}\log F(x) + \frac{d}{dx}\log G(x) + \frac{d}{dx}\log H(x)$$

A quick application of the chain rule shows that $\frac{d}{dx}\log f(x) = f'(x)/f(x)$:

$$\frac{P'(x)}{P(x)} = \frac{F'(x)}{F(x)} + \frac{G'(x)}{G(x)} + \frac{H'(x)}{H(x)}$$

Multiply through by P(x) = F(x)G(x)H(x):

$$P'(x) = F'(x)G(x)H(x) + F(x)G'(x)H(x) + F(x)G(x)H'(x)$$

which is what we found in Example 2.6.6 by repeated application of the product rule. The above generalises quite easily to more than 3 functions.


This same trick of “take a logarithm and then differentiate” — or logarithmic differentia- 
tion — will work any time you have a product (or ratio) of functions. 


Example 2.10.7 


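Let's use logarithmic differentiation on the function from Example 2.6.18:

$$f(x) = \frac{(\sqrt{x} - 1)(2 - x)(1 - x^2)}{\sqrt{x}\,(3 + 2x)}$$

Take logarithms of both sides and expand

$$\begin{aligned}
\log f(x) &= \log\frac{(\sqrt{x} - 1)(2 - x)(1 - x^2)}{\sqrt{x}\,(3 + 2x)}\\
&= \log(\sqrt{x} - 1) + \log(2 - x) + \log(1 - x^2) - \underbrace{\log(\sqrt{x})}_{=\frac12\log x} - \log(3 + 2x)
\end{aligned}$$

Now we can essentially just differentiate term-by-term:

$$\begin{aligned}
\frac{d}{dx}\log f(x) &= \frac{d}{dx}\left(\log(\sqrt{x} - 1) + \log(2 - x) + \log(1 - x^2) - \frac12\log(x) - \log(3 + 2x)\right)\\
\frac{f'(x)}{f(x)} &= \frac{1/(2\sqrt{x})}{\sqrt{x} - 1} + \frac{-1}{2 - x} + \frac{-2x}{1 - x^2} - \frac{1}{2x} - \frac{2}{3 + 2x}\\
f'(x) &= f(x)\cdot\left(\frac{1}{2x - 2\sqrt{x}} - \frac{1}{2 - x} - \frac{2x}{1 - x^2} - \frac{1}{2x} - \frac{2}{3 + 2x}\right)\\
&= \frac{(\sqrt{x} - 1)(2 - x)(1 - x^2)}{\sqrt{x}\,(3 + 2x)}\cdot\left(\frac{1}{2x - 2\sqrt{x}} - \frac{1}{2 - x} - \frac{2x}{1 - x^2} - \frac{1}{2x} - \frac{2}{3 + 2x}\right)
\end{aligned}$$

just as we found previously.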
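The result of the logarithmic differentiation above can be compared against a difference quotient. This is only a sketch: `f`, `f_prime` and `central_diff` are names chosen here, and the sample points 0.5 and 4 are arbitrary points of the domain that avoid x = 1, 2, where a factor of f vanishes and the bracket is undefined.

```python
import math

def f(x):
    """The function of Example 2.6.18, for x > 0."""
    r = math.sqrt(x)
    return (r - 1) * (2 - x) * (1 - x**2) / (r * (3 + 2 * x))

def f_prime(x):
    """f(x) times the sum of the derivatives of the logarithms of its factors."""
    r = math.sqrt(x)
    bracket = (1 / (2 * x - 2 * r) - 1 / (2 - x) - 2 * x / (1 - x**2)
               - 1 / (2 * x) - 2 / (3 + 2 * x))
    return f(x) * bracket

def central_diff(g, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (0.5, 4.0):
    print(f"x = {x}: formula = {f_prime(x):.6f}, difference quotient = {central_diff(f, x):.6f}")
```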


2.11 Implicit Differentiation


Implicit differentiation is a simple trick that is used to compute derivatives of functions 
either 


• when you don't know an explicit formula for the function, but you know an equation that the function obeys or

• even when you have an explicit, but complicated, formula for the function, and the function obeys a simple equation.


The trick is just to differentiate both sides of the equation and then solve for the derivative 
we are seeking. In fact we have already done this, without using the name “implicit 
differentiation”, when we found the derivative of log x in the previous section. There we 
knew that the function f(x) = log x satisfied the equation $e^{f(x)} = x$ for all x. That is, the functions $e^{f(x)}$ and x are in fact the same function and so have the same derivative. So we had

$$\frac{d}{dx}e^{f(x)} = \frac{d}{dx}x = 1$$

We then used the chain rule to get $\frac{d}{dx}e^{f(x)} = e^{f(x)}f'(x)$, which told us that f'(x) obeys the equation

$$e^{f(x)}f'(x) = 1 \qquad \text{and we can now solve for } f'(x)$$

$$f'(x) = e^{-f(x)} = e^{-\log x} = \frac{1}{x}.$$


The typical way to get used to implicit differentiation is to play with problems involving tangent lines to curves. So here are a few examples finding the equations of tangent lines to curves. Recall, from Theorem 2.3.2, that, in general, the tangent line to the curve y = f(x) at (x₀, y₀) is y = f(x₀) + f'(x₀)(x − x₀) = y₀ + f'(x₀)(x − x₀).
Example 2.11.1 


Find the equation of the tangent line to y = y³ + xy + x³ at x = 1.

This is a very standard sounding example, but made a little complicated by the fact that the curve is given by a cubic equation, which means we cannot solve directly for y in terms of x or vice-versa. So we really do need implicit differentiation.

• First notice that when x = 1 the equation, y = y³ + xy + x³, of the curve simplifies to y = y³ + y + 1 or y³ = −1, which we can solve³⁵: y = −1. So we know that the curve passes through (1, −1) when x = 1.

• Now, to find the slope of the tangent line at (1, −1), pretend that our curve is y = f(x) so that f(x) obeys

$$f(x) = f(x)^3 + xf(x) + x^3$$

for all x. Differentiating both sides gives

$$f'(x) = 3f(x)^2f'(x) + f(x) + xf'(x) + 3x^2$$

• At this point we could isolate f'(x) and write it in terms of f(x) and x, but since we only want answers when x = 1, let us substitute in x = 1 and f(1) = −1 (since the curve passes through (1, −1)) and clean things up before doing anything else.

• Subbing in x = 1, f(1) = −1 gives

$$f'(1) = 3f'(1) - 1 + f'(1) + 3 \qquad\text{and so}\qquad f'(1) = -\frac{2}{3}$$

• The equation of the tangent line is

$$y = y_0 + f'(x_0)(x - x_0) = -1 - \frac{2}{3}(x - 1) = -\frac{2}{3}x - \frac{1}{3}$$

We can further clean up the equation of the line to write it as 2x + 3y = −1.
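The arithmetic of this example can be replayed exactly with rational numbers. In the sketch below, `curve_residual` and `slope` are names introduced for illustration; `slope` is the differentiated equation y' = 3y²y' + y + xy' + 3x² solved for y', a step the example deliberately skips by substituting the point first.

```python
from fractions import Fraction

def curve_residual(x, y):
    """Zero exactly when (x, y) lies on the curve y = y^3 + x*y + x^3."""
    return y**3 + x * y + x**3 - y

def slope(x, y):
    """y' = (y + 3x^2) / (1 - 3y^2 - x), from the differentiated equation."""
    return Fraction(y + 3 * x**2, 1 - 3 * y**2 - x)

x0, y0 = 1, -1
m = slope(x0, y0)
print(curve_residual(x0, y0), m)

# Every point of the tangent line y = y0 + m (x - x0) should satisfy 2x + 3y = -1.
for x in (Fraction(0), Fraction(1), Fraction(5, 2)):
    y = y0 + m * (x - x0)
    print(x, y, 2 * x + 3 * y)
```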


In the previous example we replaced y by f(x) in the middle of the computation. We don't actually have to do this. When we are writing out our solution we can remember that y is a function of x. So we can start with

$$y = y^3 + xy + x^3$$

and differentiate remembering that y = y(x)

$$y' = 3y^2y' + xy' + y + 3x^2$$


35 This type of luck rarely happens in the “real world”. But it happens remarkably frequently in textbooks, problem sets and tests.

And now substitute x = 1, y = −1 to get

$$y'(1) = 3\cdot y'(1) + y'(1) - 1 + 3 \qquad\text{and so}\qquad y'(1) = -\frac{2}{3}$$


The next one is at the same time a bit easier (because it is a quadratic) and a bit harder 
(because we are asked for the tangent at a general point on the curve, not a specific one). 


Example 2.11.2

Let (x₀, y₀) be a point on the ellipse 3x² + 5y² = 7. Find the equation for the tangent lines when x = 1 and y is positive. Then find an equation for the tangent line to the ellipse at a general point (x₀, y₀).

Since we are not given a specific point x₀ we are going to have to be careful with the second half of this question.

• When x = 1 the equation simplifies to

$$\begin{aligned}
3 + 5y^2 &= 7\\
5y^2 &= 4\\
y &= \pm\frac{2}{\sqrt5}
\end{aligned}$$

We are only interested in positive y, so our point on the curve is (1, 2/√5).

• Now we use implicit differentiation to find dy/dx at this point. First we pretend that we have solved the curve explicitly, for some interval of x's, as y = f(x). The equation becomes

$$\begin{aligned}
3x^2 + 5f(x)^2 &= 7 && \text{now differentiate}\\
6x + 10f(x)f'(x) &= 0\\
f'(x) &= -\frac{3x}{5f(x)}
\end{aligned}$$

• When x = 1, y = 2/√5 this becomes

$$f'(1) = -\frac{3}{5\cdot 2/\sqrt5} = -\frac{3}{2\sqrt5}$$

So the tangent line passes through (1, 2/√5) and has slope −3/(2√5). Hence the tangent line has equation

$$\begin{aligned}
y &= y_0 + f'(x_0)(x - x_0)\\
&= \frac{2}{\sqrt5} - \frac{3}{2\sqrt5}(x - 1)\\
&= \frac{7 - 3x}{2\sqrt5} \qquad\text{or equivalently}\\
3x + 2\sqrt5\,y &= 7
\end{aligned}$$

Now we should go back and do the same but for a general point on the curve (x₀, y₀):


• A good first step here is to sketch the curve. Since this is an ellipse, it's pretty straightforward.

• Notice that there are two points on the ellipse, the extreme right and left points (x₀, y₀) = ±(√(7/3), 0), at which the tangent line is vertical. In those two cases, the tangent line is just x = x₀.

• Since this is a quadratic for y, we could solve it explicitly to get

$$y = \pm\sqrt{\frac{7 - 3x^2}{5}}$$


and choose the positive or negative branch as appropriate. Then we could differen- 
tiate to find the slope and put things together to get the tangent line. 


But even in this relatively easy case, it is computationally cleaner, and hence less 
vulnerable to mechanical errors, to use implicit differentiation. So that’s what we'll 
do. 


• Now we could again “pretend” that we have solved the equation for the ellipse for y = f(x) near (x₀, y₀), but let's not do that. Instead (as we did just before this example) just remember that when we differentiate y is really a function of x. So starting from

$$\begin{aligned}
3x^2 + 5y^2 &= 7 && \text{differentiating gives}\\
6x + 5\cdot 2y\cdot y' &= 0
\end{aligned}$$

We can then solve this for y':

$$y' = -\frac{3x}{5y}$$

where y' and y are both functions of x.

• Hence at the point (x₀, y₀) we have

$$y'\Big|_{(x_0,y_0)} = -\frac{3x_0}{5y_0}$$

This is the slope of the tangent line at (x₀, y₀) and so its equation is

$$\begin{aligned}
y &= y_0 + y'\cdot(x - x_0)\\
&= y_0 - \frac{3x_0}{5y_0}(x - x_0)
\end{aligned}$$

We can simplify this by multiplying through by 5y₀ to get

$$5y_0y = 5y_0^2 - 3x_0x + 3x_0^2$$

We can clean this up more by moving all the terms that contain x or y to the left-hand side and everything else to the right:

$$3x_0x + 5y_0y = 3x_0^2 + 5y_0^2$$

But there is one more thing we can do: our original equation is 3x² + 5y² = 7 for all points on the curve, so we know that 3x₀² + 5y₀² = 7. This cleans up the right-hand side.

$$3x_0x + 5y_0y = 7$$

• In deriving this formula for the tangent line at (x₀, y₀) we have assumed that y₀ ≠ 0. But in fact the final answer happens to also work when y₀ = 0 (which means x₀ = ±√(7/3)), so that the tangent line is x = x₀.


We can also check that our answer for general (x₀, y₀) reduces to our answer for x₀ = 1.

• When x₀ = 1 we worked out that y₀ = 2/√5.

• Plugging this into our answer above gives

$$\begin{aligned}
3x_0x + 5y_0y &= 7 && \text{sub in } (x_0, y_0) = (1, 2/\sqrt5)\\
3x + 5\cdot\frac{2}{\sqrt5}\,y &= 7 && \text{clean up a little}\\
3x + 2\sqrt5\,y &= 7
\end{aligned}$$

as required.
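The general tangent-line formula 3x₀x + 5y₀y = 7 is easy to exercise numerically, including at the extreme points where the tangent is vertical. The sketch below uses hypothetical helper names `on_ellipse` and `tangent_line`; the tolerance is an arbitrary choice.

```python
import math

def on_ellipse(x, y, tol=1e-12):
    """True when (x, y) satisfies 3x^2 + 5y^2 = 7 up to rounding."""
    return abs(3 * x**2 + 5 * y**2 - 7) < tol

def tangent_line(x0, y0):
    """Coefficients (A, B, C) of the tangent line A*x + B*y = C at (x0, y0)."""
    return 3 * x0, 5 * y0, 7.0

# The specific point from the first half of the example.
x0, y0 = 1.0, 2 / math.sqrt(5)
A, B, C = tangent_line(x0, y0)
print(on_ellipse(x0, y0), A, B, 2 * math.sqrt(5))   # B should match 2*sqrt(5)
print(-A / B, -3 / (2 * math.sqrt(5)))              # slope of the line, two ways

# At the extreme right point y0 = 0 the formula degenerates to the vertical line x = x0.
xr = math.sqrt(7 / 3)
A, B, C = tangent_line(xr, 0.0)
print(B, C / A, xr)
```

Writing the line as A x + B y = C, rather than in slope form, is what lets the same function cover the vertical tangents.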


Example 2.11.3 


At which points does the curve x² − xy + y² = 3 cross the x-axis? Are the tangent lines to the curve at those points parallel?

This is a 2 part question: first the x-intercepts and then we need to examine tangent lines.

• Finding where the curve crosses the x-axis is straightforward. It does so when y = 0. This means x satisfies

$$x^2 - x\cdot 0 + 0^2 = 3 \qquad\text{so } x = \pm\sqrt3.$$

So the curve crosses the x-axis at two points (±√3, 0).

• Now we need to find the tangent lines at those points. But we don't actually need the lines, just their slopes. Again we can pretend that near one of those points the curve is y = f(x). Applying d/dx to both sides of x² − x f(x) + f(x)² = 3 gives

$$2x - f(x) - xf'(x) + 2f(x)f'(x) = 0$$

etc etc.


• But let us stop “pretending”. Just make sure we remember that y is a function of x when we differentiate:

$$\begin{aligned}
x^2 - xy + y^2 &= 3 && \text{start with the curve, and differentiate}\\
2x - xy' - y + 2yy' &= 0
\end{aligned}$$

Now substitute in the first point, x = +√3, y = 0:

$$\begin{aligned}
2\sqrt3 - \sqrt3\,y' + 0 &= 0\\
y' &= 2
\end{aligned}$$

And now do the second point x = −√3, y = 0:

$$\begin{aligned}
-2\sqrt3 + \sqrt3\,y' + 0 &= 0\\
y' &= 2
\end{aligned}$$

Thus the slope is the same at x = √3 and x = −√3 and the tangent lines are parallel.

[Figure: the curve x² − xy + y² = 3.]
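A short check of the two slopes: solving the differentiated equation 2x − xy' − y + 2yy' = 0 for y' gives y' = (y − 2x)/(2y − x), which the sketch below evaluates at both x-intercepts. The function name `slope` is an illustrative choice.

```python
import math

def slope(x, y):
    """y' = (y - 2x) / (2y - x), from 2x - x*y' - y + 2*y*y' = 0."""
    return (y - 2 * x) / (2 * y - x)

root3 = math.sqrt(3)
for x0 in (root3, -root3):
    on_curve = math.isclose(x0**2 - x0 * 0.0 + 0.0**2, 3.0)
    print(f"x0 = {x0:+.4f}: on curve = {on_curve}, slope = {slope(x0, 0.0):.4f}")
```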


Okay, let's get away from curves and do something a little different.


Example 2.11.4 


You are standing at the origin. At time zero a pitcher throws a ball at your head³⁶.

36 It seems that it is not a friendly game today.

Figure 2.11.1.

The position of the (centre of the) ball at time t is x(t) = d − vt, where d is the distance from your head to the pitcher's mound and v is the ball's velocity. Your eye sees the ball filling³⁷ an angle 2θ(t) with

$$\sin\big(\theta(t)\big) = \frac{r}{d - vt}$$

where r is the radius of the baseball. The question is “How fast is θ growing at time t?” That is, what is dθ/dt?

37 This is the “visual angle” or “angular size”.

We don't know (yet) how to solve this equation to find θ(t) explicitly. So we use implicit differentiation.

To do so we apply d/dt to both sides of our equation. This gives

$$\cos\big(\theta(t)\big)\cdot\theta'(t) = \frac{rv}{(d - vt)^2}$$

Then we solve for θ'(t):

$$\theta'(t) = \frac{rv}{(d - vt)^2\cos\big(\theta(t)\big)}$$

As is often the case, when using implicit differentiation, this answer is not very satisfying because it contains θ(t), for which we still do not have an explicit formula. However in this case we can get an explicit formula for cos(θ(t)), without having an explicit formula for θ(t), just by looking at the right-angled triangle in Figure 2.11.1, above.

The hypotenuse of that triangle has length d − vt. By Pythagoras, the length of the side of the triangle adjacent to the angle θ(t) is √((d − vt)² − r²). So

$$\cos\big(\theta(t)\big) = \frac{\sqrt{(d - vt)^2 - r^2}}{d - vt}$$

and

$$\theta'(t) = \frac{rv}{(d - vt)\sqrt{(d - vt)^2 - r^2}}$$

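For the baseball example (Example 2.11.4), the rate θ'(t) = rv / ((d − vt)·√((d − vt)² − r²)) found by implicit differentiation can be compared with a difference quotient of θ(t) itself. The sketch below leans on the library arcsine (`math.asin`) to evaluate θ(t), a function the text only introduces in the next section; the values of r, d and v are made-up illustrative numbers, not data from the text.

```python
import math

def theta(t, r, d, v):
    """Half of the visual angle, from sin(theta) = r / (d - v*t)."""
    return math.asin(r / (d - v * t))

def theta_rate(t, r, d, v):
    """The rate of change of theta obtained by implicit differentiation."""
    s = d - v * t                      # distance from the eye to the ball
    return r * v / (s * math.sqrt(s**2 - r**2))

def central_diff(f, t, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Illustrative values only: radius and distance in metres, speed in metres per second.
r, d, v = 0.037, 18.4, 40.0
for t in (0.0, 0.2, 0.4):
    numeric = central_diff(lambda s: theta(s, r, d, v), t)
    print(f"t = {t:.1f}: formula = {theta_rate(t, r, d, v):.6f}, numeric = {numeric:.6f}")
```

The printed rates grow as the ball approaches, since the factor d − vt in the denominator shrinks.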

Okay — just one more tangent-to-the-curve example and then we'll go on to something 
different. 


Example 2.11.5 


Let (x₀, y₀) be a point on the astroid³⁸

$$x^{2/3} + y^{2/3} = 1.$$

Find an equation for the tangent line to the astroid at (x₀, y₀).

• As was the case in examples above we can rewrite the equation of the astroid near (x₀, y₀) in the form y = f(x), with an explicit f(x), by solving the equation $x^{2/3} + y^{2/3} = 1$. But again, it is computationally cleaner, and hence less vulnerable to mechanical errors, to use implicit differentiation. So that's what we'll do.

• First up, since (x₀, y₀) lies on the curve, it satisfies

$$x_0^{2/3} + y_0^{2/3} = 1.$$

• Now, no pretending that y = f(x) this time: just make sure we remember when we differentiate that y changes with x.

$$\begin{aligned}
x^{2/3} + y^{2/3} &= 1 && \text{start with the curve, and differentiate}\\
\tfrac23x^{-1/3} + \tfrac23y^{-1/3}y' &= 0
\end{aligned}$$

• Note the derivative of $x^{2/3}$, namely $\tfrac23x^{-1/3}$, and the derivative of $y^{2/3}$, namely $\tfrac23y^{-1/3}y'$, are defined only when x ≠ 0 and y ≠ 0. We are interested in the case that x = x₀ and y = y₀. So we had better assume that x₀ ≠ 0 and y₀ ≠ 0. Probably something weird happens when x₀ = 0 or y₀ = 0. We'll come back to this shortly.

• To continue on, we set x = x₀, y = y₀ in the equation above, and then solve for y':

$$\tfrac23x_0^{-1/3} + \tfrac23y_0^{-1/3}\,y'(x_0) = 0 \implies y'(x_0) = -\left(\frac{y_0}{x_0}\right)^{1/3}$$

This is the slope of the tangent line and its equation is

$$y = y_0 + f'(x_0)(x - x_0) = y_0 - \left(\frac{y_0}{x_0}\right)^{1/3}(x - x_0)$$


38 Here is where the astroid comes from. Imagine two circles, one of radius 1/4 and one of radius 1. Paint a red dot on the smaller circle. Then imagine the smaller circle rolling around the inside of the larger circle. The curve traced by the red dot is our astroid. Google “astroid” (be careful about the spelling) to find animations showing this. The astroid was first discussed by Johann Bernoulli in 1691-92. It also appears in the work of Leibniz.

Now let's think a little bit about what the tangent line slope of $-\sqrt[3]{y_0/x_0}$ tells us about the astroid.

• First, as a preliminary observation, note that since $x_0^{2/3} \ge 0$ and $y_0^{2/3} \ge 0$ the equation $x_0^{2/3} + y_0^{2/3} = 1$ of the astroid forces $0 \le x_0^{2/3}, y_0^{2/3} \le 1$ and hence −1 ≤ x₀, y₀ ≤ 1.

• For all x₀, y₀ > 0 the slope $-\sqrt[3]{y_0/x_0} < 0$. So at all points on the astroid that are in the first quadrant, the tangent line has negative slope, i.e. is “leaning backwards”.

• As x₀ tends to zero, y₀ tends to ±1 and the tangent line slope tends to infinity. So at points on the astroid near (0, ±1), the tangent line is almost vertical.

• As y₀ tends to zero, x₀ tends to ±1 and the tangent line slope tends to zero. So at points on the astroid near (±1, 0), the tangent line is almost horizontal.


Here is a figure illustrating all this. 


Sure enough, as we speculated earlier, something weird does happen to the astroid when x₀ or y₀ is zero. The astroid is pointy, and does not have a tangent there.
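The slope formula for the astroid, and its behaviour near the cusps, can be explored numerically in the first quadrant, where the curve can be solved as y = (1 − x^{2/3})^{3/2}. This is a sketch under that restriction; `astroid_y`, `tangent_slope` and the sample points are assumptions made for illustration.

```python
import math

def astroid_y(x):
    """First-quadrant branch of x^(2/3) + y^(2/3) = 1, valid for 0 < x < 1."""
    return (1 - x ** (2 / 3)) ** 1.5

def tangent_slope(x0, y0):
    """Slope -(y0/x0)^(1/3); only evaluated here for x0, y0 > 0."""
    return -((y0 / x0) ** (1 / 3))

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Near x0 = 0 the slope is steep, near x0 = 1 it is almost flat.
for x0 in (0.001, 0.3, 0.999):
    y0 = astroid_y(x0)
    print(f"x0 = {x0}: formula = {tangent_slope(x0, y0):.4f}, "
          f"difference quotient = {central_diff(astroid_y, x0):.4f}")
```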


2.12 Inverse Trigonometric Functions


One very useful application of implicit differentiation is to find the derivatives of inverse 
functions. We have already used this approach to find the derivative of the inverse of the 
exponential function — the logarithm. 

We are now going to consider the problem of finding the derivatives of the inverses 
of trigonometric functions. Now is a very good time to go back and reread Section 0.6 on 
inverse functions — especially Definition 0.6.3. Most importantly, given a function f(x), 
its inverse function f⁻¹(x) only exists, with domain D, when f(x) passes the “horizontal
line test”, which says that for each Y in D the horizontal line y = Y intersects the graph 
y = f(x) exactly once. (That is, f(x) is a one-to-one function.) 

Let us start by playing with the sine function and determine how to restrict the domain 
of sin x so that its inverse function exists. 


Example 2.12.1 


Let y = f(x) = sin(x). We would like to find the inverse function which takes y and returns to us a unique x-value so that sin(x) = y.

• For each real number Y, the number of x-values that obey sin(x) = Y is exactly the number of times the horizontal straight line y = Y intersects the graph of sin(x).

• When −1 ≤ Y ≤ 1, the horizontal line intersects the graph infinitely many times. This is illustrated in the figure above by the line y = 0.3.

• On the other hand, when Y < −1 or Y > 1, the line y = Y never intersects the graph of sin(x). This is illustrated in the figure above by the line y = −1.2.


This is exactly the horizontal line test and it shows that the sine function is not one-to-one. 
Now consider the function

$$y = \sin(x) \qquad\text{with domain } -\frac{\pi}{2} \le x \le \frac{\pi}{2}$$


This function has the same formula but the domain has been restricted so that, as we'll 
now show, the horizontal line test is satisfied. 


As we saw above when |Y| > 1 no x obeys sin(x) = Y and, for each −1 ≤ Y ≤ 1, the line y = Y (illustrated in the figure above with y = 0.3) crosses the curve y = sin(x) infinitely many times, so that there are infinitely many x's that obey f(x) = sin x = Y. However exactly one of those crossings (the dot in the figure) has −π/2 ≤ x ≤ π/2.

That is, for each −1 ≤ Y ≤ 1, there is exactly one x, call it X, that obeys both

$$\sin X = Y \qquad\text{and}\qquad -\frac{\pi}{2} \le X \le \frac{\pi}{2}$$

That unique value, X, is typically denoted arcsin(Y). That is

$$\sin\big(\arcsin(Y)\big) = Y \qquad\text{and}\qquad -\frac{\pi}{2} \le \arcsin(Y) \le \frac{\pi}{2}$$

Renaming Y → x, the inverse function arcsin(x) is defined for all −1 ≤ x ≤ 1 and is determined by the equation

$$\sin\big(\arcsin(x)\big) = x \qquad\text{and}\qquad -\frac{\pi}{2} \le \arcsin(x) \le \frac{\pi}{2} \tag{2.12.1}$$

Note that many texts will use sin⁻¹(x) to denote arcsine, however we will use arcsin(x) since we feel that it is clearer³⁹; the reader should recognise both.


Example 2.12.2

Since

$$\sin\frac{\pi}{2} = 1 \qquad \sin\frac{\pi}{6} = \frac12$$

and −π/2 ≤ π/6, π/2 ≤ π/2, we have

$$\arcsin 1 = \frac{\pi}{2} \qquad \arcsin\frac12 = \frac{\pi}{6}$$

Even though

$$\sin(2\pi) = 0$$

it is not true that arcsin 0 = 2π, and it is not true that arcsin(sin(2π)) = 2π, because 2π is not between −π/2 and π/2. More generally

$$\begin{aligned}
\arcsin\big(\sin(x)\big) &= \text{the unique angle } \theta \text{ between } -\pi/2 \text{ and } \pi/2 \text{ obeying } \sin\theta = \sin x\\
&= x \quad\text{if and only if}\quad -\pi/2 \le x \le \pi/2
\end{aligned}$$

So, for example, arcsin(sin(11π/16)) cannot be 11π/16 because 11π/16 is bigger than π/2. So how do we find the correct answer? Start by sketching the graph of sin(x).


39 The main reason being that people frequently confuse sin⁻¹(x) with (sin(x))⁻¹ = 1/sin x. We feel that prepending the prefix “arc” is less likely to lead to such confusion. The notations asin(x) and Arcsin(x) are also used.

[Figure: graph of sin(x) with the level y = sin(11π/16) marked.]

It looks like the graph of sin x is symmetric about x = π/2. The mathematical way to say that “the graph of sin x is symmetric about x = π/2” is “sin(π/2 − θ) = sin(π/2 + θ)” for all θ. That is indeed true⁴⁰.

Now 11π/16 = π/2 + 3π/16 so

$$\sin\left(\frac{11\pi}{16}\right) = \sin\left(\frac{\pi}{2} + \frac{3\pi}{16}\right) = \sin\left(\frac{\pi}{2} - \frac{3\pi}{16}\right) = \sin\left(\frac{5\pi}{16}\right)$$

and, since 5π/16 is indeed between −π/2 and π/2,

$$\arcsin\left(\sin\left(\frac{11\pi}{16}\right)\right) = \frac{5\pi}{16} \qquad \left(\text{and not } \frac{11\pi}{16}\right)$$
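The behaviour of arcsin(sin(x)) described in this example can be observed directly with the standard library's `math.asin`, which returns values in [−π/2, π/2]. The wrapper name `arcsin_of_sin` is an illustrative choice.

```python
import math

def arcsin_of_sin(x):
    """arcsin(sin(x)): the angle in [-pi/2, pi/2] with the same sine as x."""
    return math.asin(math.sin(x))

x = 11 * math.pi / 16
print(arcsin_of_sin(x), 5 * math.pi / 16)   # these two should agree up to rounding
print(arcsin_of_sin(2 * math.pi))           # very close to 0, and certainly not 2*pi
print(arcsin_of_sin(0.4))                   # inside [-pi/2, pi/2] we get x back
```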


» Derivatives of inverse trig functions 


Now that we have explored the arcsine function we are ready to find its derivative. Let's call

$$\arcsin(x) = \theta(x),$$

so that the derivative we are seeking is dθ/dx. The above equation is (after taking sine of both sides) equivalent to

$$\sin(\theta) = x$$

Now differentiate this using implicit differentiation (we just have to remember that θ varies with x and use the chain rule carefully):

$$\begin{aligned}
\cos(\theta)\cdot\frac{d\theta}{dx} &= 1\\
\frac{d\theta}{dx} &= \frac{1}{\cos(\theta)} && \text{substitute } \theta = \arcsin x\\
\frac{d}{dx}\arcsin x &= \frac{1}{\cos(\arcsin x)}
\end{aligned}$$


40 Indeed both are equal to cos θ. You can see this by playing with the trig identities in Appendix A.8.

This doesn't look too bad, but it's not really very satisfying because the right hand side is expressed in terms of arcsin(x) and we do not have an explicit formula for arcsin(x).

However even without an explicit formula for arcsin(x), it is a simple matter to get an explicit formula for cos(arcsin(x)), which is all we need. Just draw a right-angled triangle with one angle being arcsin(x). This is done in the figure below⁴¹.

Since sin(θ) = x (see (2.12.1)), we have made the side opposite the angle θ of length x and the hypotenuse of length 1. Then, by Pythagoras, the side adjacent to θ has length √(1 − x²) and so

$$\cos\big(\arcsin(x)\big) = \cos(\theta) = \sqrt{1 - x^2}$$

which in turn gives us the answer we need:

$$\frac{d}{dx}\arcsin(x) = \frac{1}{\sqrt{1 - x^2}}$$


The definitions for arccos and arctan are developed in the same way. Here are the 
graphs that are used. 


41 The figure is drawn for the case that 0 ≤ arcsin(x) < π/2. Virtually the same argument works for the case −π/2 < arcsin(x) < 0.

The definitions for the remaining three inverse trigonometric functions may also be developed in the same way. But it's a little easier to use

$$\csc x = \frac{1}{\sin x} \qquad \sec x = \frac{1}{\cos x} \qquad \cot x = \frac{1}{\tan x}$$

arcsin x is defined for |x| ≤ 1. It is the unique number obeying

$$\sin\big(\arcsin(x)\big) = x \qquad\text{and}\qquad -\frac{\pi}{2} \le \arcsin(x) \le \frac{\pi}{2}$$

arccos x is defined for |x| ≤ 1. It is the unique number obeying

$$\cos\big(\arccos(x)\big) = x \qquad\text{and}\qquad 0 \le \arccos(x) \le \pi$$

arctan x is defined for all x ∈ ℝ. It is the unique number obeying

$$\tan\big(\arctan(x)\big) = x \qquad\text{and}\qquad -\frac{\pi}{2} < \arctan(x) < \frac{\pi}{2}$$

arccsc x = arcsin(1/x) is defined for |x| ≥ 1. It is the unique number obeying

$$\csc\big(\operatorname{arccsc}(x)\big) = x \qquad\text{and}\qquad -\frac{\pi}{2} \le \operatorname{arccsc}(x) \le \frac{\pi}{2}$$

arcsec x = arccos(1/x) is defined for |x| ≥ 1. It is the unique number obeying

$$\sec\big(\operatorname{arcsec}(x)\big) = x \qquad\text{and}\qquad 0 \le \operatorname{arcsec}(x) \le \pi$$

arccot x = arctan(1/x) is defined for all 0 ≠ x ∈ ℝ. It is the unique number obeying

$$\cot\big(\operatorname{arccot}(x)\big) = x \qquad\text{and}\qquad -\frac{\pi}{2} < \operatorname{arccot}(x) < \frac{\pi}{2}$$


Example 2.12.4 


To find the derivative of arccos we can follow the same steps: 


• Write arccos(x) = θ(x) so that cos θ = x and the desired derivative is dθ/dx.

• Differentiate implicitly, remembering that θ is a function of x:

$$\begin{aligned}
-\sin\theta\cdot\frac{d\theta}{dx} &= 1\\
\frac{d\theta}{dx} &= -\frac{1}{\sin\theta}\\
\frac{d}{dx}\arccos x &= -\frac{1}{\sin(\arccos x)}
\end{aligned}$$

• To simplify this expression, again draw the relevant triangle, from which we see

$$\sin(\arccos x) = \sin\theta = \sqrt{1 - x^2}.$$

• Thus

$$\frac{d}{dx}\arccos x = -\frac{1}{\sqrt{1 - x^2}}$$

Example 2.12.5

Very similar steps give the derivative of arctan x:

• Start with θ = arctan x, so tan θ = x.

• Differentiate implicitly:

$$\begin{aligned}
\sec^2\theta\cdot\frac{d\theta}{dx} &= 1\\
\frac{d\theta}{dx} &= \frac{1}{\sec^2\theta} = \cos^2\theta\\
\frac{d}{dx}\arctan x &= \cos^2(\arctan x).
\end{aligned}$$

• To simplify this expression, we draw the relevant triangle, from which we see

$$\cos^2(\arctan x) = \cos^2\theta = \frac{1}{1 + x^2}$$

• Thus

$$\frac{d}{dx}\arctan x = \frac{1}{1 + x^2}$$


Example 2.12.6 


To find the derivative of arccsc we can use its definition and the chain rule. 


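$$\begin{aligned}
\theta &= \operatorname{arccsc} x && \text{take cosecant of both sides}\\
\csc\theta &= x && \text{but } \csc\theta = \frac{1}{\sin\theta}\text{, so flip both sides}\\
\sin\theta &= \frac{1}{x} && \text{now take arcsine of both sides}\\
\theta &= \arcsin\left(\frac{1}{x}\right)
\end{aligned}$$

Now just differentiate:

$$\begin{aligned}
\frac{d\theta}{dx} &= \frac{d}{dx}\arcsin\left(\frac{1}{x}\right) && \text{chain rule carefully}\\
&= \frac{1}{\sqrt{1 - x^{-2}}}\cdot\frac{-1}{x^2}
\end{aligned}$$

To simplify further we will factor x⁻² out of the square root. We need to be a little careful doing that. Take another look at examples 1.5.6 and 1.5.7 and the discussion between them before proceeding.

$$\begin{aligned}
&= \frac{1}{\sqrt{x^{-2}(x^2 - 1)}}\cdot\frac{-1}{x^2}\\
&= \frac{1}{|x^{-1}|\cdot\sqrt{x^2 - 1}}\cdot\frac{-1}{x^2} && \text{note that } x^2\cdot|x^{-1}| = |x|\\
&= -\frac{1}{|x|\sqrt{x^2 - 1}}
\end{aligned}$$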


In the same way we can find the derivatives of the remaining two inverse trig functions. 
We just use their definitions, some derivatives we already know and the chain rule. 


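$$\frac{d}{dx}\operatorname{arcsec}(x) = \frac{d}{dx}\arccos\left(\frac{1}{x}\right) = -\frac{1}{\sqrt{1 - x^{-2}}}\cdot\left(-\frac{1}{x^2}\right) = \frac{1}{|x|\sqrt{x^2 - 1}}$$

$$\frac{d}{dx}\operatorname{arccot}(x) = \frac{d}{dx}\arctan\left(\frac{1}{x}\right) = \frac{1}{1 + \frac{1}{x^2}}\cdot\left(-\frac{1}{x^2}\right) = -\frac{1}{1 + x^2}$$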


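By way of summary, we have

Theorem 2.12.7.

The derivatives of the inverse trigonometric functions are

$$\begin{aligned}
\frac{d}{dx}\arcsin(x) &= \frac{1}{\sqrt{1 - x^2}} &\qquad \frac{d}{dx}\operatorname{arccsc}(x) &= -\frac{1}{|x|\sqrt{x^2 - 1}}\\
\frac{d}{dx}\arccos(x) &= -\frac{1}{\sqrt{1 - x^2}} &\qquad \frac{d}{dx}\operatorname{arcsec}(x) &= \frac{1}{|x|\sqrt{x^2 - 1}}\\
\frac{d}{dx}\arctan(x) &= \frac{1}{1 + x^2} &\qquad \frac{d}{dx}\operatorname{arccot}(x) &= -\frac{1}{1 + x^2}
\end{aligned}$$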
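All six derivative formulas derived in this section can be spot-checked against difference quotients. The sketch below builds arccsc, arcsec and arccot from their definitions in terms of arcsin(1/x), arccos(1/x) and arctan(1/x); the table layout, the helper names and the sample points are assumptions made for illustration, with negative sample points included to exercise the absolute values.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# name: (function, derivative formula, sample points inside the domain)
TABLE = {
    "arcsin": (math.asin, lambda x: 1 / math.sqrt(1 - x * x), (-0.5, 0.3)),
    "arccos": (math.acos, lambda x: -1 / math.sqrt(1 - x * x), (-0.5, 0.3)),
    "arctan": (math.atan, lambda x: 1 / (1 + x * x), (-2.0, 0.3)),
    "arccsc": (lambda x: math.asin(1 / x),
               lambda x: -1 / (abs(x) * math.sqrt(x * x - 1)), (-2.0, 3.0)),
    "arcsec": (lambda x: math.acos(1 / x),
               lambda x: 1 / (abs(x) * math.sqrt(x * x - 1)), (-2.0, 3.0)),
    "arccot": (lambda x: math.atan(1 / x), lambda x: -1 / (1 + x * x), (-2.0, 0.3)),
}

def max_error():
    """Largest gap between a difference quotient and the matching formula."""
    return max(abs(central_diff(f, x) - df(x))
               for f, df, points in TABLE.values() for x in points)

for name, (f, df, points) in TABLE.items():
    for x in points:
        print(f"{name}({x}): formula = {df(x):.6f}, difference quotient = {central_diff(f, x):.6f}")
```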


2.13 The Mean Value Theorem


Consider the following situation. Two towns are separated by a 120km long stretch of road. The police in town A observe a car leaving at 1pm. Their colleagues in town B see the car arriving at 2pm. After a quick phone call between the two police stations, the driver is issued a fine for going 120km/h at some time between 1pm and 2pm. It is intuitively obvious⁴² that, because his average velocity was 120km/h, the driver must have been going at least 120km/h at some point. From a knowledge of the average velocity of the car, we are able to deduce something about an instantaneous velocity⁴³.

Let us turn this around a little bit. Consider the premise of a 90s action film⁴⁴: a bus must travel at a velocity of no less than 80km/h. Being a bus, it is unable to go faster than, say, 120km/h. The film runs for about 2 hours, and let's assume that there is about thirty minutes of non-action, so the bus' velocity is constrained between 80 and 120km/h for a total of 1.5 hours.

It is again obvious that the bus must have travelled between 80 × 1.5 = 120 and 120 × 1.5 = 180km during the film. This time, from a knowledge of the instantaneous rate of change of position (the derivative) throughout a 90 minute time interval, we are able to say something about the net change of position during the 90 minutes.

In both of these scenarios we are making use of a piece of mathematics called the Mean Value Theorem. It says that, under appropriate hypotheses, the average rate of change (f(b) − f(a))/(b − a) of a function over an interval is achieved exactly by the instantaneous rate of change f'(c) of the function at some⁴⁵ (unknown) point a < c < b. We shall get to a precise statement in Theorem 2.13.4. We start working up to it by first considering the special case in which f(a) = f(b).

42 Unfortunately there are many obvious things that are decidedly false, for example “There are more rational numbers than integers.” or “Viking helmets had horns on them”.

43 Recall that speed and velocity are not the same.

• Velocity specifies the direction of motion as well as the rate of change. Objects moving along a straight line have velocities that are positive or negative numbers indicating which direction the object is moving along the line.

• Speed, on the other hand, is the distance travelled per unit time and is always a non-negative number: it is the absolute value of velocity.

44 The sequel won a Raspberry award for “Worst remake or sequel”.

45 There must be at least one such point (there could be more than one) but there cannot be zero.


» Rolle’s theorem 


Theorem 2.13.1 (Rolle’s theorem). 


Let a and b be real numbers with a < b. And let f be a function so that
• f(x) is continuous on the closed interval a ≤ x ≤ b,
• f(x) is differentiable on the open interval a < x < b, and
• f(a) = f(b)

then there is a c strictly between a and b, i.e. obeying a < c < b, such that

$$f'(c) = 0.$$


Again, like the two scenarios above, this theorem says something intuitively obvious. 
Consider — if you throw a ball straight up into the air and then catch it, at some time in 
between the throw and the catch it must be stationary. Translating this into mathematical 
statements, let s(t) be the height of the ball above the ground in metres, and let t be time 
from the moment the ball is thrown in seconds. Then we have 


s(0) =1 we release the ball at about hip-height 
s(4) =1 we catch the ball 4s later at hip-height 


Then we know there is some time in between, say at t = c, when the ball is stationary (in this case when the ball is at the top of its trajectory), i.e.

$$v(c) = s'(c) = 0.$$


Rolle’s theorem guarantees that for any differentiable function that starts and ends at the 
same value, there will always be at least one point between the start and finish where the 
derivative is zero. 


Figure 2.13.1.

There can, of course, also be multiple points at which the derivative is zero, but there must always be at least one. Notice, however, the theorem⁴⁶ does not tell us the value of c, just that such a c must exist.


Example 2.13.2

We can use Rolle's theorem to show that the function

$$f(x) = \sin(x) - \cos(x)$$

has a point c between 0 and 3π/2 so that f'(c) = 0.
To apply Rolle's theorem we first have to show the function satisfies the conditions of the theorem on the interval [0, 3π/2].

• Since f is the sum of sine and cosine it is continuous on the interval and also differentiable on the interval.

• Further, since

$$\begin{aligned}
f(0) &= \sin 0 - \cos 0 = 0 - 1 = -1\\
f\left(\frac{3\pi}{2}\right) &= \sin\frac{3\pi}{2} - \cos\frac{3\pi}{2} = -1 - 0 = -1
\end{aligned}$$

we can now apply Rolle's theorem.

• Rolle's theorem implies that there must be a point c ∈ (0, 3π/2) so that f'(c) = 0.

While Rolle's theorem doesn't tell us the value of c, this example is sufficiently simple that we can find it directly.

$$\begin{aligned}
f'(x) &= \cos x + \sin x\\
f'(c) &= \cos c + \sin c = 0 && \text{rearrange}\\
\sin c &= -\cos c && \text{and divide by } \cos c\\
\tan c &= -1
\end{aligned}$$

Hence c = 3π/4. We have sketched the function and the relevant points below.

[Figure: graph of sin(x) − cos(x).]

46 Notice this is very similar to the intermediate value theorem (see Theorem 1.6.12)


A more substantial application of Rolle’s theorem (in conjunction with the intermediate 
value theorem — Theorem 1.6.12) is to show that a function does not have multiple zeros 
in an interval: 


Example 2.13.3 


Show that the equation 2x — 1 = sin(x) has exactly 1 solution. 


• Start with a rough sketch of each side of the equation

[Figure: sketch of the graphs of 2x − 1 and sin x.]

This seems like it should be true.


• Notice that the problem we are trying to solve is equivalent to showing that the function

$$f(x) = 2x - 1 - \sin(x)$$

has only a single zero.

• Since f(x) is the sum of a polynomial and a sine function, it is continuous and differentiable everywhere. Thus we can apply both the IVT and Rolle's theorem.

• Notice that f(0) = −1 and f(2) = 4 − 1 − sin(2) = 3 − sin(2) ≥ 2, since −1 ≤ sin(2) ≤ 1. Thus by the IVT we know there is at least one number c between 0 and 2 so that f(c) = 0.


e But our job is only half done — this shows that there is at least one zero, but it does 
not tell us there is no more than one. We have more work to do, and Rolle’s theorem 
is the tool we need. 


• Consider what would happen if f(x) is zero in 2 places, that is, there are numbers a, b so that f(a) = f(b) = 0.

  - Since f(x) is differentiable everywhere and f(a) = f(b) = 0, we can apply Rolle's theorem.

  - Hence we know there is a point c between a and b so that f'(c) = 0.

  - But let us examine f'(x):

$$f'(x) = 2 - \cos x$$

    Since −1 ≤ cos x ≤ 1, we must have that f'(x) ≥ 1.

  - But this contradicts Rolle's theorem which tells us there must be a point at which the derivative is zero.

Thus the function cannot be zero at two different places, otherwise we'd have a contradiction.

We can actually nail down the value of c using the bisection approach we used in example 1.6.15. If we do this carefully we find that c ≈ 0.887862...
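The bisection mentioned at the end of the example can be sketched as follows. The starting interval [0, 2] is the one used in the IVT step above; the function name `bisect`, the tolerance and the stopping rule are illustrative choices rather than the procedure of example 1.6.15.

```python
import math

def f(x):
    """f(x) = 2x - 1 - sin(x); its zero solves 2x - 1 = sin(x)."""
    return 2 * x - 1 - math.sin(x)

def bisect(func, lo, hi, tol=1e-10):
    """Bisection on [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid      # the sign change is in the right half
        else:
            hi = mid                   # the sign change is in the left half
    return (lo + hi) / 2

root = bisect(f, 0.0, 2.0)
print(f"c is approximately {root:.6f}, with f(c) = {f(root):.2e}")
```

Because f'(x) = 2 − cos x ≥ 1, the function is strictly increasing, so the single sign change that bisection tracks is the only zero.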


» Back to the MVT 


Rolle's theorem can be generalised in a straightforward way; given a differentiable function f(x) we can still say something about df/dx, even if f(a) ≠ f(b). Consider the following sketch:


Figure 2.13.2. 


All we have done is tilt the picture so that f(a) < f(b). Now we can no longer guarantee
that there will be a point on the graph where the tangent line is horizontal, but there
will be a point where the tangent line is parallel to the secant joining (a, f(a)) to (b, f(b)).

To state this in terms of our first scenario back at the beginning of this section, suppose
that you are driving along the x-axis. At time t = a you are at x = f(a) and at time t = b
you are at x = f(b). For simplicity, let’s suppose that b > a and f(b) > f(a), just like in
the above sketch. Then during the time interval in question you travelled a net distance of
f(b) - f(a). It took you b - a units of time to travel that distance, so your average velocity
was (f(b) - f(a))/(b - a). You may very well have been going faster than
(f(b) - f(a))/(b - a) part of the time and slower than (f(b) - f(a))/(b - a) part of the
time. But it is reasonable to guess that at some time between t = a and t = b your
instantaneous velocity was exactly (f(b) - f(a))/(b - a). The mean value theorem says
that, under reasonable assumptions about f, this is indeed the case.




Theorem 2.13.4 (The mean value theorem). 


Let a and b be real numbers with a < b. And let f(x) be a function so that
• f(x) is continuous on the closed interval a ≤ x ≤ b, and
• f(x) is differentiable on the open interval a < x < b

then there is a c ∈ (a, b), such that

f'(c) = (f(b) - f(a)) / (b - a)

which we can also express as

f(b) = f(a) + f'(c)(b - a).


Let us start to explore the mean value theorem — which is very frequently known as 
the MVT. A simple example to start: 


Example 2.13.5 


Consider the polynomial f(x) = 3x^2 - 4x + 2 on [-1, 1].

• Since f is a polynomial it is continuous on the interval and also differentiable on the
interval. Hence we can apply the MVT.

• The MVT tells us that there is a point c ∈ (-1, 1) so that

f'(c) = (f(1) - f(-1)) / (1 - (-1)) = (1 - 9)/2 = -4

This example is sufficiently simple that we can find the point c and the corresponding
tangent line:

• The derivative is

f'(x) = 6x - 4

• So we need to solve f'(c) = -4:

6c - 4 = -4

which tells us that c = 0.

• The tangent line has slope -4 and passes through (0, f(0)) = (0, 2), and so is given
by

y = -4x + 2

• The secant line joining (-1, f(-1)) = (-1, 9) to (1, f(1)) = (1, 1) is just

y = 5 - 4x

• Here is a sketch of the curve and the two lines:

[Figure: the curve y = 3x^2 - 4x + 2, the secant line y = 5 - 4x and the tangent line y = 2 - 4x]
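The computation in this example is easy to replay numerically. The following Python sketch is an illustration only; the names `f`, `f_prime`, `secant_slope` and `tangent` are ours, and solving for c relies on f' being linear here.

```python
def f(x):
    return 3 * x**2 - 4 * x + 2

def f_prime(x):
    return 6 * x - 4

a, b = -1.0, 1.0

# Slope of the secant joining (a, f(a)) to (b, f(b))
secant_slope = (f(b) - f(a)) / (b - a)

# f'(c) = 6c - 4 is linear, so f'(c) = secant_slope can be solved directly
c = (secant_slope + 4) / 6

def tangent(x):
    # Tangent line to the curve at the point (c, f(c))
    return f(c) + f_prime(c) * (x - c)

print(secant_slope, c, tangent(1.0))
```

For a general function one would have to solve f'(c) = secant_slope numerically, and there may be more than one such c.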


Example 2.13.6


We can return to our initial car-motivated examples. Say you are driving along a straight 
road in a car that can go at most 80km/h. How far can you go in 2 hours? — the answer is 
easy, but we can also solve this using MVT. 


• Let s(t) be the position of the car in km at time t measured in hours.

• Then s(0) = 0 and s(2) = q, where q is the quantity that we need to bound.

• We are told that |s'(t)| ≤ 80, or equivalently

-80 ≤ s'(t) ≤ 80

• By the MVT there is some c between 0 and 2 so that

s'(c) = (q - 0)/2 = q/2

• Now since -80 ≤ s'(c) ≤ 80 we must have -80 ≤ q/2 ≤ 80 and hence -160 ≤ q =
s(2) ≤ 160.


More generally if we have some information about the derivative, then we can use the 
MVT to leverage this information to tell us something about the function. 


Example 2.13.7 
Let f(x) be a differentiable function so that

f(1) = 10 and -1 ≤ f'(x) ≤ 2 everywhere

Obtain upper and lower bounds on f(5).

Okay, what do we do?

• Since f(x) is differentiable we can use the MVT.

• Say f(5) = q, then the MVT tells us that there is some c between 1 and 5 such that

f'(c) = (q - 10)/(5 - 1) = (q - 10)/4

• But we know that -1 ≤ f'(c) ≤ 2, so

-1 ≤ (q - 10)/4 ≤ 2
-4 ≤ q - 10 ≤ 8
 6 ≤ q ≤ 18

• Thus we must have 6 ≤ f(5) ≤ 18.
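The same reasoning works for any starting value and any derivative bounds, so it can be packaged as a small helper. This is a sketch under our own naming (`mvt_bounds` is not from the text) and it assumes b > a.

```python
def mvt_bounds(f_a, a, b, d_min, d_max):
    """Bounds on f(b) given f(a) and d_min <= f'(x) <= d_max on [a, b], with b > a.

    By the MVT, f(b) = f(a) + f'(c)(b - a) for some c between a and b.
    """
    width = b - a
    return f_a + d_min * width, f_a + d_max * width

# Example 2.13.7: f(1) = 10 and -1 <= f'(x) <= 2
lower, upper = mvt_bounds(10, 1, 5, -1, 2)

# Example 2.13.6: s(0) = 0 and -80 <= s'(t) <= 80 over 2 hours
car_lower, car_upper = mvt_bounds(0, 0, 2, -80, 80)

print(lower, upper, car_lower, car_upper)
```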




>>> (optional) — Why is the MVT true 


We won’t give a real proof for this theorem, but we’ll look at a picture which shows why
it is true. Here is the picture. It contains a sketch of the graph of f(x), with x running from
a to b, as well as a line segment which is the secant of the graph from the point (a, f(a))
to the point (b, f(b)). The slope of the secant is exactly (f(b) - f(a))/(b - a). Remember
that we are looking for a point, (c, f(c)), on the graph of f(x) with the property that
f'(c) = (f(b) - f(a))/(b - a), i.e. with the property that the slope of the tangent line at
(c, f(c)) is the same as the slope of the secant. So imagine that you start moving the
secant upward, carefully keeping the moved line segment parallel to the secant. So the
slope of the moved line segment is always exactly (f(b) - f(a))/(b - a). When we first
start moving the line segment it is not tangent to the curve; it crosses the curve. This is
illustrated in the figure by the second line segment from the bottom. If we move the line
segment too far it does not touch the curve at all. This is illustrated in the figure by the
top segment. But if we stop moving the line segment just before it stops intersecting the
curve at all, we get exactly the tangent line to the curve at the point on the curve that is
farthest from the secant. This tangent line has exactly the desired slope. This is
illustrated in the figure by the third line segment from the bottom.

» Be careful with hypotheses


The mean value theorem has hypotheses: f(x) has to be continuous for a ≤ x ≤ b and
has to be differentiable for a < x < b. If either hypothesis is violated, the conclusion of the
mean value theorem can fail. That is, the curve y = f(x) need not have a tangent line at
some x = c between a and b whose slope, f'(c), is the same as the slope,
(f(b) - f(a))/(b - a), of the secant joining the points (a, f(a)) and (b, f(b)) on the curve.
If f'(x) fails to exist for even a single value of x between a and b, all bets are off. The
following two examples illustrate this.
Example 2.13.8 


For the first “bad” example, a = 0, b = 2 and

f(x) = 0   if x ≤ 1
       1   if x > 1

For this example, f'(x) = 0 at every x where it is defined. That is, at every x ≠ 1. But the
slope of the secant joining (a, f(a)) = (0, 0) and (b, f(b)) = (2, 1) is 1/2.


Example 2.13.9 
For the second “bad” example, a = -1, b = 1 and f(x) = |x|. For this function

f'(x) = -1          if x < 0
        undefined   if x = 0
        1           if x > 0

For this example, f'(x) = ±1 at every x where it is defined. That is, at every x ≠ 0. But
the slope of the secant joining (a, f(a)) = (-1, 1) and (b, f(b)) = (1, 1) is 0.


Example 2.13.10


Here is one “good” example, where the hypotheses of the mean value theorem are
satisfied. Let f(x) = x^2. Then f'(x) = 2x. For any a < b,

(f(b) - f(a))/(b - a) = (b^2 - a^2)/(b - a) = b + a

So f'(c) = 2c is exactly (f(b) - f(a))/(b - a) when c = (a + b)/2, which, in this example,
happens to be exactly half way between x = a and x = b.


A simple consequence of the mean value theorem is that if you know the sign of f’(c) 
for all c’s between a and b, with b > a, then f(b) — f(a) = f’(c)(b — a) must have the same 
sign. 


Corollary 2.13.11 (Consequences of the mean value theorem). 


Let A and B be real numbers with A < B. Let function f(x) be defined and
continuous on the closed interval A ≤ x ≤ B and be differentiable on the open
interval A < x < B.

(a) If f'(c) = 0 for all A < c < B, then f(b) = f(a) for all A ≤ a < b ≤ B.
    That is, f(x) is constant on A ≤ x ≤ B.

(b) If f'(c) ≥ 0 for all A < c < B, then f(b) ≥ f(a) for all A ≤ a ≤ b ≤ B.
    That is, f(x) is increasing on A ≤ x ≤ B.

(c) If f'(c) ≤ 0 for all A < c < B, then f(b) ≤ f(a) for all A ≤ a ≤ b ≤ B.
    That is, f(x) is decreasing on A ≤ x ≤ B.


It is not hard to see why the above is true: 


• Say f'(x) = 0 at every point in the interval [A, B]. Now pick any a, b ∈ [A, B] with
a < b. Then the MVT tells us that there is c ∈ (a, b) so that

f'(c) = (f(b) - f(a)) / (b - a)

If f(b) ≠ f(a) then we must have that f'(c) ≠ 0, contradicting what we are told
about f'(x). Thus we must have that f(b) = f(a).

• Similarly, say f'(x) ≥ 0 at every point in the interval [A, B]. Now pick any a, b ∈
[A, B] with a < b. Then the MVT tells us that there is c ∈ (a, b) so that

f'(c) = (f(b) - f(a)) / (b - a)

Since b > a, the denominator is positive. Now if f(b) < f(a) the numerator would
be negative, making the right-hand side negative, and contradicting what we are
told about f'(x). Hence we must have f(b) ≥ f(a).

A nice corollary of the above corollary is the following:


Corollary 2.13.12. 


If f'(x) = g'(x) for all x in the open interval (a, b), then f - g is a constant on
(a, b). That is f(x) = g(x) + c, where c is some constant.


We can prove this by setting h(x) = f(x) — g(x). Then h’(x) = 0 and so the previous 
corollary tells us that h(x) is constant. 


Example 2.13.13 


Using this corollary we can prove results like the following: 


arcsin x + arccos x = π/2 for all -1 < x < 1

How does this work? Let f(x) = arcsin x + arccos x. Then

f'(x) = 1/√(1 - x^2) + (-1)/√(1 - x^2) = 0

Thus f must be a constant. To find out which constant, we can just check its value at a
convenient point, like x = 0.

arcsin(0) + arccos(0) = 0 + π/2 = π/2

Since the function is constant, this must be the value.
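A numerical spot check of this identity is straightforward. The sketch below samples a handful of points of our own choosing in (-1, 1); it illustrates the result but is of course not a proof.

```python
import math

# Sample points in the open interval (-1, 1); the particular values are arbitrary
samples = [-0.99, -0.5, 0.0, 0.3, 0.75, 0.99]

# How far arcsin(x) + arccos(x) is from pi/2 at each sample point
deviations = [abs(math.asin(x) + math.acos(x) - math.pi / 2) for x in samples]
max_deviation = max(deviations)

print(max_deviation)
```

The largest deviation should be at the level of floating point rounding error.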




2.14 Higher order derivatives


The operation of differentiation takes as input one function, f(x), and produces as
output another function, f'(x). Now f'(x) is once again a function. So we can differentiate
it again, assuming that it is differentiable, to create a third function, called the second
derivative of f. And we can differentiate the second derivative again to create a fourth
function, called the third derivative of f. And so on.


Notation 2.14.1. 


• f''(x) and f^(2)(x) and d^2f/dx^2 (x) all mean d/dx (d/dx f(x))

• f'''(x) and f^(3)(x) and d^3f/dx^3 (x) all mean d/dx (d/dx (d/dx f(x)))

• f^(4)(x) and d^4f/dx^4 (x) both mean d/dx (d/dx (d/dx (d/dx f(x))))

• and so on.

Here is a simple example. Then we’ll think a little about the significance of second
order derivatives. Then we’ll do a more computationally complex example.


Example 2.14.2 


Let n be a natural number and let f(x) = x^n. Then

d/dx x^n = n x^(n-1)

d^2/dx^2 x^n = d/dx (n x^(n-1)) = n(n-1) x^(n-2)

d^3/dx^3 x^n = d/dx (n(n-1) x^(n-2)) = n(n-1)(n-2) x^(n-3)

Each time we differentiate, we bring down the exponent, which is exactly one smaller
than the previous exponent brought down, and we reduce the exponent by one. By the
time we have differentiated n - 1 times, the exponent has decreased to n - (n - 1) = 1
and we have brought down the factors n(n-1)(n-2)···2. So

d^(n-1)/dx^(n-1) x^n = n(n-1)(n-2)···2 x

and

d^n/dx^n x^n = n(n-1)(n-2)···1

The product of the first n natural numbers, 1·2·3·····n, is called “n factorial” and is
denoted n!. So we can also write

d^n/dx^n x^n = n!

If m > n, then

d^m/dx^m x^n = 0
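The pattern of repeatedly bringing down the exponent can be checked with a few lines of Python. In this sketch a polynomial is stored as a list of coefficients, with `coeffs[k]` the coefficient of x^k; that representation and the function names are our own choices. The empty list stands for the zero polynomial.

```python
import math

def differentiate(coeffs):
    """Return the coefficient list of the derivative of the given polynomial."""
    # d/dx of c * x**k is k * c * x**(k - 1); the constant term drops out
    return [k * c for k, c in enumerate(coeffs)][1:]

def nth_derivative(coeffs, m):
    """Differentiate the polynomial m times."""
    for _ in range(m):
        coeffs = differentiate(coeffs)
    return coeffs

n = 5
x_to_the_n = [0] * n + [1]  # the polynomial x**5

print(nth_derivative(x_to_the_n, n - 1))  # coefficients of n(n-1)...2 x
print(nth_derivative(x_to_the_n, n))      # the constant n!
print(nth_derivative(x_to_the_n, n + 1))  # the zero polynomial
```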


Example 2.14.3


Recall that the derivative v'(a) is the (instantaneous) rate of change of the function v(t) at
t = a. Suppose that you are walking on the x-axis and that x(t) is your x-coordinate
at time t. Also suppose, for simplicity, that you are moving from left to right. Then
v(t) = x'(t) is your velocity at time t and v'(a) = x''(a) is the rate at which your
velocity is changing at time t = a. It is called your acceleration. In particular, if x''(a) > 0,
then your velocity is increasing, i.e. you are speeding up, at time a. If x''(a) < 0, then your
velocity is decreasing, i.e. you are slowing down, at time a. That’s one interpretation of
the second derivative.


Example 2.14.4 (Example 2.11.1, continued)


Find y'' if y = y^3 + xy + x^3.


Solution. This problem concerns some function y(x) that is not given to us explicitly. All
that we are told is that y(x) satisfies

y(x) = y(x)^3 + x y(x) + x^3     (2.14.1)

for all x. We are asked to find y''(x). We cannot solve this equation to get an explicit
formula for y(x). So we use implicit differentiation, as we did in Example 2.11.1. That is,
we apply d/dx to both sides of (2.14.1). This gives

y'(x) = 3y(x)^2 y'(x) + y(x) + x y'(x) + 3x^2     (2.14.2)

which we can solve for y'(x), by moving all y'(x)’s to the left hand side, giving

[1 - x - 3y(x)^2] y'(x) = y(x) + 3x^2

and then dividing across.

y'(x) = [y(x) + 3x^2] / [1 - x - 3y(x)^2]     (2.14.3)

To get y''(x), we have two options.

Method 1. Apply d/dx to both sides of (2.14.2). This gives

y''(x) = 3y(x)^2 y''(x) + 6y(x) y'(x)^2 + 2y'(x) + x y''(x) + 6x

We can now solve for y''(x), giving

y''(x) = [6x + 2y'(x) + 6y(x) y'(x)^2] / [1 - x - 3y(x)^2]     (2.14.4)

Then we can substitute in (2.14.3), giving

y''(x) = 2 { 3x + [y(x) + 3x^2]/[1 - x - 3y(x)^2] + 3y(x) ([y(x) + 3x^2]/[1 - x - 3y(x)^2])^2 } / [1 - x - 3y(x)^2]

       = 2 { 3x[1 - x - 3y(x)^2]^2 + [y(x) + 3x^2][1 - x - 3y(x)^2] + 3y(x)[y(x) + 3x^2]^2 } / [1 - x - 3y(x)^2]^3

Method 2. Alternatively, we can also differentiate (2.14.3).

y''(x) = { [y'(x) + 6x][1 - x - 3y(x)^2] - [y(x) + 3x^2][-1 - 6y(x)y'(x)] } / [1 - x - 3y(x)^2]^2

       = { [ (y(x) + 3x^2)/(1 - x - 3y(x)^2) + 6x ][1 - x - 3y(x)^2] - [y(x) + 3x^2][ -1 - 6y(x)(y(x) + 3x^2)/(1 - x - 3y(x)^2) ] } / [1 - x - 3y(x)^2]^2

       = { 2[y(x) + 3x^2][1 - x - 3y(x)^2] + 6x[1 - x - 3y(x)^2]^2 + 6y(x)[y(x) + 3x^2]^2 } / [1 - x - 3y(x)^2]^3

Remark 1. We have now computed y''(x), sort of. The answer is in terms of y(x), which
we don’t know. Since we cannot get an explicit formula for y(x), there’s not a great deal
that we can do, in general.

Remark 2. Even though we cannot solve y = y^3 + xy + x^3 explicitly for y(x), for general
x, it is sometimes possible to solve equations like this for some special values of x. In
fact, we saw in Example 2.11.1 that when x = 1, the given equation reduces to y(1) =
y(1)^3 + 1·y(1) + 1^3, or y(1)^3 = -1, which we can solve to get y(1) = -1. Substituting
into (2.14.2), as we did in Example 2.11.1 gives

y'(1) = (-1 + 3)/(1 - 1 - 3) = -2/3

and substituting into (2.14.4) gives

y''(1) = [6 + 2(-2/3) + 6(-1)(-2/3)^2] / (1 - 1 - 3) = (6 - 4/3 - 8/3)/(-3) = -2/3

(It’s a fluke that, in this example, y'(1) and y''(1) happen to be equal.) So we now know
that, even though we can’t solve y = y^3 + xy + x^3 explicitly for y(x), the graph of the
solution passes through (1, -1) and has slope -2/3 (i.e. is sloping downwards by between 30°
and 45°) there and, furthermore, the slope of the graph decreases as x increases through
x = 1.


Here is a sketch of the part of the graph very near (1, -1). The tangent line to the graph at
(1, -1) is also shown. Note that the tangent line is sloping down to the right, as we expect,
and that the graph lies below the tangent line near (1, -1). That’s because the slope y'(x)
is decreasing (becoming more negative) as x passes through 1.
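The values at x = 1 can be double checked with exact rational arithmetic. The sketch below simply evaluates formulas (2.14.3) and (2.14.4) at a given point (x, y) on the curve; the function names are ours, and the code assumes it is given integer coordinates of a point that really lies on the curve.

```python
from fractions import Fraction

def y_prime(x, y):
    # (2.14.3): y' = (y + 3x^2) / (1 - x - 3y^2)
    return Fraction(y + 3 * x**2, 1 - x - 3 * y**2)

def y_double_prime(x, y):
    # (2.14.4): y'' = (6x + 2y' + 6y (y')^2) / (1 - x - 3y^2)
    yp = y_prime(x, y)
    return (6 * x + 2 * yp + 6 * y * yp**2) / (1 - x - 3 * y**2)

x0, y0 = 1, -1

# Confirm that (1, -1) satisfies y = y^3 + xy + x^3 before using the formulas
on_curve = (y0 == y0**3 + x0 * y0 + x0**3)

print(on_curve, y_prime(x0, y0), y_double_prime(x0, y0))
```

Using `Fraction` rather than floating point keeps the answers as exact fractions, which makes the comparison with the hand computation direct.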




Warning 2.14.5. 


Many people will suppress the (x) in y(x) when doing computations like those
in Example 2.14.4. This gives shorter, easier to read formulae, like
y' = (y + 3x^2)/(1 - x - 3y^2). If you do this, you must never forget that y is a
function of x and is not a constant. If you do forget, you’ll make the very serious
error of saying that dy/dx = 0, which is false.




APPLICATIONS OF DERIVATIVES 


In Section 2.2 we defined the derivative at x = a, f'(a), of an abstract function f(x), to be
its instantaneous rate of change at x = a:

f'(a) = lim_{x→a} (f(x) - f(a)) / (x - a)

This abstract definition, and the whole theory that we have developed to deal with it,
turns out to be extremely useful simply because “instantaneous rate of change” appears in a
huge number of settings. Here are a few examples.


• If you are moving along a line and x(t) is your position on the line at time t, then
your rate of change of position, x'(t), is your velocity. If, instead, v(t) is your velocity
at time t, then your rate of change of velocity, v'(t), is your acceleration. We shall
explore this further in Section 3.1.

• If P(t) is the size of some population (say the number of humans on the earth) at
time t, then P'(t) is the rate at which the size of that population is changing. It is
called the net birth rate. We shall explore it further in Section 3.3.3.

• Radiocarbon dating, a procedure used to determine the age of, for example,
archaeological materials, is based on an understanding of the rate at which an unstable
isotope of carbon decays. We shall look at this procedure in Section 3.3.1.

• A capacitor is an electrical component that is used to repeatedly store and release
electrical charge (say electrons) in an electronic circuit. If Q(t) is the charge on a
capacitor at time t, then Q'(t) is the instantaneous rate at which charge is flowing into
the capacitor. That’s called the current. The standard unit of charge is the coulomb.
One coulomb is the magnitude of the charge of approximately 6.241 × 10^18 electrons.
The standard unit for current is the amp. One amp represents one coulomb per
second.

3.1 Velocity and acceleration


If you are moving along the x-axis and your position at time t is x(t), then your velocity
at time t is v(t) = x'(t) and your acceleration at time t is a(t) = v'(t) = x''(t).


Example 3.1.1 


Suppose that you are moving along the x-axis and that at time t your position is given by

x(t) = t^3 - 3t + 2.


We're going to try and get a good picture of what your motion is like. We can learn quite 
a bit just by looking at the sign of the velocity v(t) = x’(t) at each time t. 


• If x'(t) > 0, then at that instant x is increasing, i.e. you are moving to the right.
• If x'(t) = 0, then at that instant you are not moving at all.
• If x'(t) < 0, then at that instant x is decreasing, i.e. you are moving to the left.

From the given formula for x(t) it is straightforward to work out the velocity

v(t) = x'(t) = 3t^2 - 3 = 3(t^2 - 1) = 3(t + 1)(t - 1)

This is zero only when t = -1 and when t = +1; at no other value¹ of t can this
polynomial be equal to zero. Consequently in any time interval that does not include either
t = -1 or t = +1, v(t) takes only a single sign². So

• For all t < -1, both (t + 1) and (t - 1) are negative (sub in, for example, t = -10) so
the product v(t) = x'(t) = 3(t + 1)(t - 1) > 0.

• For all -1 < t < 1, the factor (t + 1) > 0 and the factor (t - 1) < 0 (sub in, for
example, t = 0) so the product v(t) = x'(t) = 3(t + 1)(t - 1) < 0.

• For all t > 1, both (t + 1) and (t - 1) are positive (sub in, for example, t = +10) so
the product v(t) = x'(t) = 3(t + 1)(t - 1) > 0.

The figure below gives a summary of the sign information we have about t - 1, t + 1 and
x'(t).

1 This is because the equation ab = 0 is only satisfied for real numbers a and b when either a = 0 or b = 0
or both a = b = 0. Hence if a polynomial is the product of two (or more) factors, then it is only zero
when at least one of those factors is zero. There are more complicated mathematical environments in
which you have what are called “zero divisors” but they are beyond the scope of this course.

2 This is because if v(t_a) < 0 and v(t_b) > 0 then, by the intermediate value theorem, the continuous
function v(t) = x'(t) must take the value 0 for some t between t_a and t_b.

It is now easy to put together a mental image of your trajectory.


• For t large and negative (i.e. far in the past), x(t) is large and negative and v(t) is
large and positive. For example³, when t = -10^6, x(t) ≈ t^3 = -10^18 and v(t) ≈
3t^2 = 3·10^12. So you are moving quickly to the right.

• For t < -1, v(t) = x'(t) > 0 so that x(t) is increasing and you are moving to the
right.

• At t = -1, v(-1) = 0 and you have come to a halt at position x(-1) = (-1)^3 -
3(-1) + 2 = 4.

• For -1 < t < 1, v(t) = x'(t) < 0 so that x(t) is decreasing and you are moving to the
left.

• At t = +1, v(1) = 0 and you have again come to a halt, but now at position x(1) =
1 - 3 + 2 = 0.

• For t > 1, v(t) = x'(t) > 0 so that x(t) is increasing and you are again moving to the
right.

• For t large and positive (i.e. in the far future), x(t) is large and positive and v(t) is
large and positive. For example⁴, when t = 10^6, x(t) ≈ t^3 = 10^18 and v(t) ≈ 3t^2 =
3·10^12. So you are moving quickly to the right.
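The trajectory described in the list above can be tabulated directly. The following Python sketch is only an illustration; the function names and the choice of sample times are ours.

```python
def x(t):
    # Position at time t
    return t**3 - 3 * t + 2

def v(t):
    # Velocity: the derivative of x(t)
    return 3 * t**2 - 3

def direction(t):
    """Describe the motion at time t using the sign of the velocity."""
    velocity = v(t)
    if velocity > 0:
        return "right"
    if velocity < 0:
        return "left"
    return "stopped"

for t in (-10, -1, 0, 1, 10):
    print(t, x(t), v(t), direction(t))
```

Each row printed shows the time, the position, the velocity and the direction of motion at that instant.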


Here is a sketch of the graphs of x(t) and v(t). The heavy lines in the graphs indicate 
when you are moving to the right — that is where v(t) = x’(t) is positive. 


3 Notice here we are using the fact that when t is very large t^3 is much bigger than t^2 and t^1. So we can
approximate the value of the polynomial x(t) by the largest term, in this case t^3. We can do similarly
with v(t), whose largest term is 3t^2.

4 We are making a similar rough approximation here.


Example 3.1.2 


In this example we are going to figure out how far a body falling from rest will fall in a 
given time period. 


• We should start by defining some variables and their units. Denote

  - time in seconds by t,
  - mass in kilograms by m,
  - distance fallen (in metres) at time t by s(t), velocity (in m/sec) by v(t) = s'(t)
    and acceleration (in m/sec^2) by a(t) = v'(t) = s''(t).

It makes sense to choose a coordinate system so that the body starts to fall at t = 0.

• We will use Newton’s second law of motion

the force applied to the body at time t = m · a(t).

together with the assumption that the only force acting on the body is gravity (in
particular, no air resistance). Note that near the surface of the Earth,

the force due to gravity acting on a body of mass m = m · g.

The constant g, called the acceleration of gravity⁵, is about 9.8 m/sec^2.


5 It is also called the standard acceleration due to gravity or standard gravity. For those of you who
prefer imperial units (or US customary units), it is about 32 ft/sec^2, 77165 cubits/minute^2, or 631353
furlongs/hour^2.

• Since the body is falling from rest, we know that its initial velocity is zero. That is
v(0) = 0.

Newton’s second law then implies that

m · a(t) = force due to gravity
m · v'(t) = m · g          cancel the m
v'(t) = g

• In order to find the velocity, we need to find a function of t whose derivative is
constant. We are simply going to guess such a function and then we will verify
that our guess has all of the desired properties. It’s easy to guess a function whose
derivative is the constant g. Certainly gt has the correct derivative. So does

v(t) = gt + c

for any constant c. One can then verify⁶ that v'(t) = g. Using the fact that v(0) = 0
we must then have c = 0 and so

v(t) = gt.

• Since velocity is the derivative of position, we know that

s'(t) = v(t) = g · t.

To find s(t) we are again going to guess and check. It’s not hard to see that we can
use

s(t) = (g/2) t^2 + c

where again c is some constant. Again we can verify that this works simply by
differentiating⁷. Since we have defined s(t) to be the distance fallen, it follows that
s(0) = 0 which in turn tells us that c = 0. Hence

s(t) = (g/2) t^2          but g = 9.8, so
     = 4.9 t^2,

which is exactly the s(t) used way back in Section 1.2.
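The "guess and check" step can also be checked numerically. The sketch below compares a central difference approximation of s'(t) with v(t) = gt; the helper `central_difference`, the step size and the sample time t = 2 are our own choices.

```python
G = 9.8  # acceleration of gravity in m/sec^2, as quoted in the text

def s(t):
    # Distance fallen after t seconds: s(t) = (g/2) t^2
    return G / 2 * t**2

def v(t):
    # Velocity after t seconds: v(t) = g t
    return G * t

def central_difference(func, t, h=1e-6):
    """Approximate func'(t) by the symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)

t0 = 2.0
print(s(t0), v(t0), central_difference(s, t0))
```

The difference quotient should agree with v(2) up to rounding error, which is consistent with s'(t) = v(t).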


6 While it is clear that this satisfies the equation we want, it is less clear that it is the only function that
works. To see this, assume that there are two functions f(t) and h(t) which both satisfy v'(t) = g. Then
f'(t) = h'(t) = g and so f'(t) - h'(t) = 0. Equivalently

d/dt (f(t) - h(t)) = 0.

The only function whose derivative is zero everywhere is the constant function (see Section 2.13 and
Corollary 2.13.11). Thus f(t) - h(t) = constant. So all the functions that satisfy v'(t) = g must be of the
form gt + constant.

7 To show that any solution of s'(t) = gt must be of this form we can use the same reasoning we used to
get v(t) = gt + constant.


Let’s now do a similar but more complicated example. 


Example 3.1.3


A car’s brakes can decelerate the car at 64000 km/hr^2. How fast can the car be driven if it
must be able to stop within a distance of 50m?


Solution. Before getting started, notice that there is a small “trick” in this problem:
several quantities are stated but their units are different. The acceleration is stated in
kilometres per hour^2, but the distance is stated in metres. Whenever we come across a
“real world” problem⁸ we should be careful of the units used.


• We should first define some variables and their units. Denote

  - time (in hours) by t,
  - the position of the car (in kilometres) at time t by x(t), and
  - the velocity (in kilometres per hour) by v(t).

We can also choose a coordinate system such that x(0) = 0 and the car starts braking
at time t = 0.

• Now let us rewrite the information in the problem in terms of these variables.

  - We are told that, at maximum braking, the acceleration v'(t) = x''(t) of the car
    is -64000.

  - We need to determine the maximum initial velocity v(0) so that the stopping
    distance is at most 50m = 0.05km (being careful with our units). Let us call the
    stopping distance x_stop which is really x(t_stop) where t_stop is the stopping time.

• In order to determine x_stop we first need to determine t_stop, which we will do by
assuming maximum braking from a, yet to be determined, initial velocity of v(0) =
q km/hr.

• Assuming that the car undergoes a constant acceleration at this maximum braking
power, we have

v'(t) = -64000


This equation is very similar to the ones we had to solve in Example 3.1.2 just above.

As we did there⁹, we are going to just guess v(t). First, we just guess one
function whose derivative is -64000, namely -64000t. Next we observe that, since the
derivative of a constant is zero, any function of the form

v(t) = -64000t + c

with constant c, has the correct derivative. Finally, the requirement that the initial
velocity v(0) = q forces c = q, so

v(t) = q - 64000t


8 Well, “realer world” would perhaps be a betterer term.

9 Now is a good time to go back and have a read of that example.
• From this we can easily determine the stopping time t_stop, when the initial velocity
is q, since this is just when v(t) = 0:

0 = v(t_stop) = q - 64000 · t_stop          and so
t_stop = q/64000

• Armed with the stopping time, how do we get at the stopping distance? We need to
find the formula satisfied by x(t). Again (as per Example 3.1.2) we make use of the
fact that

x'(t) = v(t) = q - 64000t.

So we need to guess a function x(t) so that x'(t) = q - 64000t. It is not hard to see
that

x(t) = qt - 32000t^2 + constant

works. Since we know that x(0) = 0, this constant is just zero and

x(t) = qt - 32000t^2.

• We are now ready to compute the stopping distance (in terms of the, still yet to be
determined, initial velocity q):

x_stop = x(t_stop) = q t_stop - 32000 t_stop^2
       = q^2/64000 - 32000 q^2/64000^2
       = (q^2/64000)(1 - 1/2)
       = q^2/(2 × 64000)

Notice that the stopping distance is a quadratic function of the initial velocity: if
you go twice as fast, you need four times the distance to stop.

• But we are told that the stopping distance must be at most 50m = 0.05km. This
means that

x_stop = q^2/(2 × 64000) ≤ 5/100
q^2 ≤ (2 × 64000 × 5)/100 = (64000 × 10)/100 = 6400

Thus we must have q ≤ 80. Hence the initial velocity can be no greater than 80km/h.
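The formulas derived in this example are easy to evaluate. The Python sketch below assumes the same units as the solution (kilometres and hours); the function and constant names are ours.

```python
import math

DECELERATION = 64000.0  # km/hr^2, the maximum braking stated in the problem

def stopping_distance_km(q):
    """Distance (km) needed to stop from an initial velocity of q km/hr."""
    t_stop = q / DECELERATION                      # time at which v(t) = 0
    return q * t_stop - DECELERATION / 2 * t_stop**2

def max_initial_speed(distance_km):
    """Largest q (km/hr) with q^2 / (2 * 64000) <= distance_km."""
    return math.sqrt(2 * DECELERATION * distance_km)

q_max = max_initial_speed(0.05)  # 50 m = 0.05 km
print(q_max, stopping_distance_km(q_max), stopping_distance_km(2 * q_max))
```

The last value printed illustrates the quadratic dependence: doubling the initial speed quadruples the stopping distance.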


3.2 Related rates


Consider the following problem 


A spherical balloon is being inflated at a rate of 13cm^3/sec. How fast is the
radius changing when the balloon has radius 15cm?


There are several pieces of information in the statement: 


• The balloon is spherical

• The volume is changing at a rate of 13cm^3/sec, so we need variables for volume
(in cm^3) and time (in sec). Good choices are V and t.

• We are asked for the rate at which the radius is changing, so we need a variable for
radius and units. A good choice is r, measured in cm, since volume is measured
in cm^3.

Since the balloon is a sphere we know¹⁰ that

V = (4/3) π r^3

Since both the volume and radius are changing with time, both V and r are implicitly
functions of time; we could really write

V(t) = (4/3) π r(t)^3

We are told the rate at which the volume is changing and we need to find the rate at which
the radius is changing. That is, from a knowledge of dV/dt find the related rate¹¹ dr/dt.
In this case, we can just differentiate our equation by t to get

dV/dt = 4π r^2 dr/dt

This can then be rearranged to give

dr/dt = (1/(4π r^2)) dV/dt

Now we were told that dV/dt = 13, so

dr/dt = 13/(4π r^2).

We were also told that the radius is 15cm, so at that moment in time

dr/dt = 13/(4π × 15^2).

This is a very typical example of a related rate problem. This section is really just a 
collection of problems, but all will follow a similar pattern. 


10 If you don’t know the formula for the volume of a sphere, now is a good time to revise by looking at
Appendix A.11.

11 Related rate problems are problems in which you are given the rate of change of one quantity and are
to determine the rate of change of another, related, quantity.

• The statement of the problem will tell you quantities that must be related (above it
was volume, radius and, implicitly, time).

• Typically a little geometry (or some physics or...) will allow you to relate these
quantities (above it was the formula that links the volume of a sphere to its radius).

• Implicit differentiation will then allow you to link the rate of change of one quantity
to another.


Another balloon example 


Example 3.2.1 


Consider a helium balloon rising vertically from a fixed point 200m away from you. You
are trying to work out how fast it is rising. Now, computing the velocity directly is
difficult, but you can measure angles. You observe that when it is at an angle of π/4 its
angle is changing by 0.05 radians per second.


• Start by drawing a picture with the relevant variables


• So denote the angle to be θ (in radians), the height of the balloon (in m) by h and
time (in seconds) by t. Then trigonometry tells us

h = 200 · tan θ

• Differentiating allows us to relate the rates of change

dh/dt = 200 sec^2 θ · dθ/dt


• We are told that when θ = π/4 we observe dθ/dt = 0.05, so

dh/dt = 200 · sec^2(π/4) · 0.05
      = 200 · 0.05 · (√2)^2
      = 200 · (5/100) · 2
      = 20 m/s

• So the balloon is rising at a rate of 20m/s.
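The same calculation can be done numerically for any observed angle. In this sketch the function name and argument names are ours; it simply evaluates dh/dt = 200 sec^2(θ) dθ/dt.

```python
import math

def height_rate(distance, theta, dtheta_dt):
    """Rate of change of height (m/s) of a balloon rising vertically.

    distance  : horizontal distance to the launch point (m)
    theta     : angle of elevation (radians)
    dtheta_dt : rate of change of that angle (radians/s)
    """
    # sec^2(theta) = 1 / cos^2(theta)
    return distance * dtheta_dt / math.cos(theta) ** 2

rate = height_rate(200, math.pi / 4, 0.05)
print(rate)
```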


The following problem is perhaps the classic related rate problem. 


Example 3.2.2 


A 5m ladder is leaning against a wall. The floor is quite slippery and the base of the ladder 
slides out from the wall at a rate of 1m/s. How fast is the top of the ladder sliding down 
the wall when the base of the ladder is 3m from the wall? 


• A good first step is to draw a picture stating all relevant quantities. This will also
help us define variables and units.

So now define x(t) to be the distance between the bottom of the ladder and the wall,
at time t, and let y(t) be the distance between the top of the ladder and the ground
at time t. Measure time in seconds, but both distances in meters.

• We can relate the quantities using Pythagoras:

x^2 + y^2 = 5^2

Differentiating with respect to time then gives

2x dx/dt + 2y dy/dt = 0

• We know that dx/dt = 1 and x = 3, so

6 · 1 + 2y dy/dt = 0

but we need to determine y before we can go further. Thankfully we know that
x^2 + y^2 = 25 and x = 3, so y^2 = 25 - 9 = 16 and¹² so y = 4.


12 Since the ladder isn’t buried in the ground, we can discard the solution y = -4.

• So finally putting everything together

6 · 1 + 8 dy/dt = 0
dy/dt = -3/4 m/s.

Thus the top of the ladder is sliding towards the floor at a rate of 3/4 m/s.
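The steps of this example translate directly into code. The sketch below solves 2x dx/dt + 2y dy/dt = 0 for dy/dt, taking the positive root for y; the function name is ours.

```python
import math

def top_speed(ladder, x, dx_dt):
    """dy/dt for the top of a ladder of the given length (m) whose base is
    x metres from the wall and sliding away at dx_dt (m/s)."""
    y = math.sqrt(ladder**2 - x**2)  # height of the top, from Pythagoras
    return -x * dx_dt / y            # from 2x dx/dt + 2y dy/dt = 0

dy_dt = top_speed(5, 3, 1)
print(dy_dt)
```

The negative sign records that y is decreasing, i.e. the top of the ladder is sliding down the wall.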




The next example is complicated by the rates of change being stated not just as “the
rate of change per unit time” but instead being stated as “the percentage rate of change
per unit time”. If a quantity f is changing with rate df/dt then we can say that

f is changing at a rate of 100 · (df/dt)/f percent.

Thus if, at time t, f has rate of change r%, then

100 f'(t)/f(t) = r   ⟹   f'(t) = (r/100) f(t)

so that if h is a very small time increment

(f(t + h) - f(t))/h ≈ f'(t) = (r/100) f(t)   ⟹   f(t + h) ≈ f(t) + (rh/100) f(t)

That is, over a very small time interval h, f increases by the fraction rh/100 of its value at
time t.


So armed with this, let’s look at the problem. 


Example 3.2.3

The quantities P, Q and R are functions of time and are related by the equation R = PQ.
Assume that P is increasing instantaneously at the rate of 8% per year (meaning that
$100\frac{P'}{P} = 8$) and that Q is decreasing instantaneously at the rate of 2% per year
(meaning that $100\frac{Q'}{Q} = -2$). Determine the percentage rate of change for R.


Solution. This one is a little different — we are given the variables and the formula, so no 
picture drawing or defining required. Though we do need to define a time variable — let 
t denote time in years. 


• Since R(t) = P(t) · Q(t) we can differentiate with respect to t to get

\[ \frac{dR}{dt} = PQ' + QP' \]

• But we need the percentage change in R, namely

\[ 100\frac{R'}{R} = 100\frac{PQ' + QP'}{R} \]

but R = PQ, so rewrite it as

\[ 100\frac{R'}{R} = 100\frac{PQ' + QP'}{PQ} = 100\frac{PQ'}{PQ} + 100\frac{QP'}{PQ} = 100\frac{Q'}{Q} + 100\frac{P'}{P} \]

so we have stated the instantaneous percentage rate of change in R as the sum of the
percentage rates of change in P and Q.

• We know the percentage rates of change of P and Q, so

\[ 100\frac{R'}{R} = -2 + 8 = 6 \]

That is, the instantaneous percentage rate of change of R is 6% per year.
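The "percentage rates add for a product" conclusion can be checked numerically. The sketch below assumes, purely for illustration, that P and Q are exponentials with the stated percentage rates; the starting values 50 and 20 are hypothetical and do not come from the example.

```python
import math

def P(t):
    return 50.0 * math.exp(0.08 * t)    # hypothetical P, growing at 8% per year

def Q(t):
    return 20.0 * math.exp(-0.02 * t)   # hypothetical Q, shrinking at 2% per year

def R(t):
    return P(t) * Q(t)

def percent_rate(f, t, h=1e-6):
    """Estimate 100 * f'(t) / f(t) with a central finite difference."""
    return 100.0 * (f(t + h) - f(t - h)) / (2 * h) / f(t)

print(percent_rate(P, 0.0), percent_rate(Q, 0.0), percent_rate(R, 0.0))
```

The three printed values should be close to 8, −2 and 6 respectively.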


Yet another falling object example. 


Example 3.2.4

A ball is dropped from a height of 49m above level ground. The height of the ball at time
t is $h(t) = 49 - 4.9t^2$ m. A light, which is also 49m above the ground, is 10m to the left
of the ball’s original position. As the ball descends, the shadow of the ball caused by the
light moves across the ground. How fast is the shadow moving one second after the ball
is dropped?


Solution. There is quite a bit going on in this example, so read carefully. 


• First a diagram; the one below is perhaps a bit over the top.

• Let’s call s(t) the distance from the shadow to the point on the ground directly
underneath the ball.

• By similar triangles we see that

\[ \frac{4.9t^2}{10} = \frac{49 - 4.9t^2}{s(t)} \]

We can then solve for s(t) by just multiplying both sides by $\frac{10}{4.9t^2}\,s(t)$. This gives

\[ s(t) = 10\,\frac{49 - 4.9t^2}{4.9t^2} = \frac{100}{t^2} - 10 \]

• Differentiating with respect to t will then give us the rates,

\[ s'(t) = -2 \cdot \frac{100}{t^3} \]

• So, at t = 1, s′(1) = −200 m/sec. That is, the shadow is moving to the left at
200m/sec.
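A short Python sketch, under the same similar-triangles relation, confirms the derivative numerically. The function names and step size are illustrative assumptions.

```python
def shadow_distance(t):
    """s(t) solved from 4.9 t^2 / 10 = (49 - 4.9 t^2) / s(t)."""
    return 10.0 * (49.0 - 4.9 * t**2) / (4.9 * t**2)

def shadow_velocity(t):
    """s'(t) = -200 / t^3, obtained by differentiating s(t) = 100 / t^2 - 10."""
    return -200.0 / t**3

def shadow_velocity_numeric(t, h=1e-6):
    """Central finite-difference estimate of s'(t)."""
    return (shadow_distance(t + h) - shadow_distance(t - h)) / (2 * h)

print(shadow_distance(1.0), shadow_velocity(1.0), shadow_velocity_numeric(1.0))
```

At t = 1 the shadow is 90m from the point under the ball and both velocity estimates are close to −200 m/sec.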


A more nautical example.

Example 3.2.5

Two boats spot each other in the ocean at midday. Boat A is 15km west of Boat B. Boat
A is travelling east at 3km/h and boat B is travelling north at 4km/h. At 3pm how fast is
the distance between the boats changing?

• First we draw a picture.

• Let x(t) be the distance at time t, in km, from boat A to the original position of boat
B (i.e. to the position of boat B at noon). And let y(t) be the distance at time t, in
km, of boat B from its original position. And let z(t) be the distance between the two
boats at time t.

• Additionally we are told that x′ = −3 and y′ = 4. Notice that x′ < 0 since that
distance is getting smaller with time, while y′ > 0 since that distance is increasing
with time.

• Further at 3pm boat A has travelled 9km towards the original position of boat B, so
x = 15 − 9 = 6, while boat B has travelled 12km away from its original position, so
y = 12.

• The distances x, y and z form a right-angled triangle, and Pythagoras tells us that

\[ z^2 = x^2 + y^2 \]

At 3pm we know x = 6, y = 12 so

\[ z^2 = 36 + 144 = 180 \qquad\text{and}\qquad z = \sqrt{180} = 6\sqrt{5}. \]

• Differentiating then gives

\[ 2z\frac{dz}{dt} = 2x\frac{dx}{dt} + 2y\frac{dy}{dt} = 12 \cdot (-3) + 24 \cdot (4) = 60. \]

Dividing through by $2z = 12\sqrt{5}$ then gives

\[ \frac{dz}{dt} = \frac{60}{12\sqrt{5}} = \frac{5}{\sqrt{5}} = \sqrt{5} \]

So the distance between the boats is increasing at $\sqrt{5}$ km/h.
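The boat computation can be reproduced in a few lines of Python. This is only a sketch: the helper names are illustrative, and t is measured in hours after midday as in the example.

```python
import math

def positions(t):
    """Return (x, y): distance of boat A from B's noon position, and of B from it."""
    return 15.0 - 3.0 * t, 4.0 * t

def separation(t):
    x, y = positions(t)
    return math.hypot(x, y)

def separation_rate(t):
    """dz/dt from 2z z' = 2x x' + 2y y' with x' = -3 and y' = 4."""
    x, y = positions(t)
    return (x * (-3.0) + y * 4.0) / separation(t)

print(separation(3.0), separation_rate(3.0), math.sqrt(5))
```

At t = 3 the separation is $6\sqrt{5} \approx 13.4$ km and the rate matches $\sqrt{5} \approx 2.24$ km/h.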


One last one before we move on to another topic. 


Example 3.2.6

Consider a cylindrical fuel tank of radius r and length L (in some appropriate units) that
is lying on its side. Suppose that fuel is being pumped into the tank at a rate q. At what
rate is the fuel level rising?

Solution. If the tank were vertical everything would be much easier. Unfortunately the
tank is on its side, so we are going to have to work a bit harder to establish the relation
between the depth and volume. Also notice that we have not been supplied with units for
this problem, so we do not need to state the units of our variables.

• Again, draw a picture. Here is an end view of the tank; the shaded part of the
circle is filled with fuel.

• Let us denote by V(t) the volume of fuel in the tank at time t and by h(t) the fuel
level at time t.

• We have been told that V′(t) = q and have been asked to determine h′(t). While it is
possible to do so by finding a formula relating V(t) and h(t), it turns out to be quite
a bit easier to first find a formula relating V and the angle θ shown in the end view.
We can then translate this back into a formula in terms of h using the relation

\[ h(t) = r - r\cos\theta(t). \]

Once we know θ′(t), we can easily obtain h′(t) by differentiating the above equation.

• The computation that follows below gets a little involved in places, so we will drop
the “(t)” on the variables V, h and θ. The reader must never forget that these three
quantities are really functions of time, while r and L are constants that do not depend
on time.

• The volume of fuel is L times the cross-sectional area filled by the fuel. That is,

\[ V = L \times \text{Area(shaded region)} \]

While we do not have a canned formula for the area of a chord of a circle like this, it
is easy to express the area of the chord in terms of two areas that we can compute.

\[ V = L \times \text{Area(shaded region)} = L \times \big[\text{Area(piece of pie)} - \text{Area(triangle)}\big] \]

– The piece of pie is the fraction $\frac{2\theta}{2\pi}$ of the full circle, so its area is
$\frac{2\theta}{2\pi}\,\pi r^2 = \theta r^2$.

– The triangle has height $r\cos\theta$ and base $2r\sin\theta$ and hence has area
$\frac{1}{2}(r\cos\theta)(2r\sin\theta) = r^2\sin\theta\cos\theta = \frac{r^2}{2}\sin(2\theta)$, where we have used a
double-angle formula (see Appendix A.12).

Subbing these two areas into the above expression for V gives

\[ V = L \times \left[\theta r^2 - \frac{r^2}{2}\sin 2\theta\right] = \frac{Lr^2}{2}\big[2\theta - \sin 2\theta\big] \]

Oof!

• Now we can differentiate to find the rate of change. Recalling that V = V(t) and
θ = θ(t), while r and L are constants,

\[ V' = \frac{Lr^2}{2}\big[2\theta' - 2\cos 2\theta \cdot \theta'\big] = Lr^2 \cdot \theta' \cdot \big[1 - \cos 2\theta\big] \]

Solving this for θ′ and using V′ = q gives

\[ \theta' = \frac{q}{Lr^2(1 - \cos 2\theta)} \]

This is the rate at which θ is changing, but we need the rate at which h is changing.
We get this from $h = r - r\cos\theta$; differentiating this gives

\[ h' = r\sin\theta \cdot \theta' \]

Substituting our expression for θ′ into the expression for h′ gives

\[ h' = r\sin\theta \cdot \frac{q}{Lr^2(1 - \cos 2\theta)} \]

• We can clean this up a bit more by recalling more double-angle formulas. Substituting
$\cos 2\theta = 1 - 2\sin^2\theta$,

\[ h' = r\sin\theta \cdot \frac{q}{Lr^2(1 - \cos 2\theta)} = r\sin\theta \cdot \frac{q}{Lr^2 \cdot 2\sin^2\theta} = \frac{q}{2Lr\sin\theta} \]

• But we can clean this up even more. Instead of writing this rate in terms of θ it is
more natural to write it in terms of h (since the initial problem is stated in terms of
h). From the triangle in the end view (hypotenuse r, vertical side r − h) and Pythagoras we have

\[ \sin\theta = \frac{\sqrt{r^2 - (r-h)^2}}{r} = \frac{\sqrt{2rh - h^2}}{r} \]

and hence

\[ h' = \frac{q}{2L\sqrt{2rh - h^2}} \]

• As a check, notice that h′ becomes undefined when h < 0 and also when h > 2r,
because then the argument of the square root in the denominator is negative. Both
make sense: the fuel level in the tank must obey 0 ≤ h ≤ 2r.
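The final formula for h′ can be cross-checked numerically. The sketch below assumes hypothetical values r = 1, L = 3 and q = 2 (the example gives no numbers) and compares $q / (2L\sqrt{2rh - h^2})$ with q divided by a finite-difference estimate of dV/dh.

```python
import math

def volume(h, r, L):
    """Fuel volume at depth h, using V = (L r^2 / 2) (2 theta - sin 2 theta)."""
    theta = math.acos((r - h) / r)          # from h = r - r cos(theta)
    return 0.5 * L * r**2 * (2 * theta - math.sin(2 * theta))

def level_rate(h, r, L, q):
    """h' = q / (2 L sqrt(2 r h - h^2))."""
    return q / (2 * L * math.sqrt(2 * r * h - h**2))

def level_rate_numeric(h, r, L, q, dh=1e-6):
    """h' = V' / (dV/dh), with dV/dh estimated by a central difference."""
    dV_dh = (volume(h + dh, r, L) - volume(h - dh, r, L)) / (2 * dh)
    return q / dV_dh

r, L, q = 1.0, 3.0, 2.0                      # hypothetical values for illustration
print(level_rate(0.5, r, L, q), level_rate_numeric(0.5, r, L, q))
print(volume(2 * r, r, L), math.pi * r**2 * L)   # a full tank holds pi r^2 L
```

The two rates agree, and the volume formula returns the full cylinder volume when h = 2r.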


13 Take another look at Appendix A.12.

3.3 Exponential growth and decay — a first look at differential equations

A differential equation is an equation for an unknown function that involves the
derivative of the unknown function. For example, Newton’s law of cooling says:

The rate of change of temperature of an object is proportional to the difference
in temperature between the object and its surroundings.

We can write this more mathematically using a differential equation, that is, an equation
for the unknown function T(t) that also involves its derivative $\frac{dT}{dt}(t)$. If we denote by T(t) the
temperature of the object at time t and by A the temperature of its surroundings, Newton’s
law of cooling says that there is some constant of proportionality, K, such that

\[ \frac{dT}{dt}(t) = K\big[T(t) - A\big] \]


Differential equations play a central role in modelling a huge number of different 
phenomena, including the motion of particles, electromagnetic radiation, financial op- 
tions, ecosystem populations and nerve action potentials. Most universities offer half a 
dozen different undergraduate courses on various aspects of differential equations. We 
are barely going to scratch the surface of the subject. At this point we are going to restrict 
ourselves to a few very simple differential equations for which we can just guess the solu- 
tion. In particular, we shall learn how to solve systems obeying Newton’s law of cooling 
in Section 3.3.2, below. But first, here is another slightly simpler example. 


3.3.1 Carbon dating

Scientists can determine the age of objects containing organic material by a method called
carbon dating or radiocarbon dating¹⁴. Cosmic rays hitting the atmosphere convert nitrogen
into a radioactive isotope of carbon, $^{14}$C, with a half-life of about 5730 years¹⁵. Vegetation
absorbs carbon dioxide from the atmosphere through photosynthesis and animals acquire
$^{14}$C by eating plants. When a plant or animal dies, it stops replacing its carbon and the
amount of $^{14}$C begins to decrease through radioactive decay. More precisely, let Q(t)
denote the amount of $^{14}$C in the plant or animal t years after it dies. The number of
radioactive decays per unit time, at time t, is proportional to the amount of $^{14}$C present at
time t, which is Q(t). Thus

Equation 3.3.1 (Radioactive decay).

\[ \frac{dQ}{dt}(t) = -kQ(t) \]


Here k is a constant of proportionality that is determined by the half-life. We shall explain 
what half-life is and also determine the value of k in Example 3.3.3, below. Before we do 
so, let’s think about the sign in equation (3.3.1). 


• Recall that Q(t) denotes a quantity, namely the amount of $^{14}$C present at time t.
There cannot be a negative amount of $^{14}$C, nor can this quantity be zero (otherwise
we wouldn’t use carbon dating), so we must have Q(t) > 0.

• As the time t increases, Q(t) decreases, because $^{14}$C is being continuously converted
into $^{14}$N by radioactive decay¹⁶. Thus $\frac{dQ}{dt}(t) < 0$.

• The signs Q(t) > 0 and $\frac{dQ}{dt}(t) < 0$ are consistent with equation (3.3.1) provided the
constant of proportionality k > 0.

• In equation (3.3.1), we chose to call the constant of proportionality “−k”. We did
so in order to make k > 0. We could just as well have chosen to call the constant
of proportionality “K”. That is, we could have replaced equation (3.3.1) by
$\frac{dQ}{dt}(t) = KQ(t)$. The constant of proportionality K would have to be negative (and K and k
would be related by K = −k).

14 Willard Libby, of Chicago University, was awarded the Nobel Prize in Chemistry in 1960 for developing
radiocarbon dating.
15 A good question to ask yourself is “How can a scientist (who presumably doesn’t live 60 centuries)
measure this quantity?” One way exploits the little piece of calculus we are about to discuss.


Now, let’s guess some solutions to equation (3.3.1). We wish to guess a function Q(t)
whose derivative is just a constant times itself. Here is a short table of derivatives. It is
certainly not complete, but it contains the most important derivatives that we know.

F(t)          | 1 | t^a        | sin t | cos t  | tan t   | e^t | log t | arcsin t         | arctan t
d/dt F(t)     | 0 | a t^(a−1)  | cos t | −sin t | sec² t  | e^t | 1/t   | 1/√(1 − t²)      | 1/(1 + t²)

There is exactly one function in this table whose derivative is just a (nonzero) constant
times itself. Namely, the derivative of $e^t$ is exactly $e^t = 1 \times e^t$. This is almost, but not
quite what we want. We want the derivative of Q(t) to be the constant −k (rather than the
constant 1) times Q(t). We want the derivative to “pull a constant” out of our guess. That
is exactly what happens when we differentiate $e^{at}$, where a is a constant. Differentiating
gives

\[ \frac{d}{dt}e^{at} = a e^{at} \]

i.e. “pulls the constant a out of $e^{at}$”.


We have succeeded in guessing a single function, namely $e^{-kt}$, that obeys equation (3.3.1).
Can we guess any other solutions? Yes. If C is any constant, $Ce^{-kt}$ also obeys
equation (3.3.1):

\[ \frac{d}{dt}\big(Ce^{-kt}\big) = C\frac{d}{dt}e^{-kt} = Ce^{-kt}(-k) = -k\big(Ce^{-kt}\big) \]

You can try guessing some more solutions, but you won’t find any, because with a little
trickery we can prove that a function Q(t) obeys equation (3.3.1) if and only if Q(t) is of
the form $Ce^{-kt}$, where C is some constant.

The trick¹⁷ is to imagine that Q(t) is any (at this stage, unknown) solution to (3.3.1)
and to compare Q(t) and our known solution $e^{-kt}$ by studying the ratio $Q(t)/e^{-kt}$. We
will show that Q(t) obeys (3.3.1) if and only if the ratio $Q(t)/e^{-kt}$ is a constant, i.e. if and
only if the derivative of the ratio is zero. By the product rule

\[ \frac{d}{dt}\big[Q(t)/e^{-kt}\big] = \frac{d}{dt}\big[e^{kt}Q(t)\big] = ke^{kt}Q(t) + e^{kt}Q'(t) \]

Since $e^{kt}$ is never 0, the right hand side is zero if and only if kQ(t) + Q′(t) = 0; that is
Q′(t) = −kQ(t). Thus

\[ \frac{dQ}{dt}(t) = -kQ(t) \iff \frac{d}{dt}\big[Q(t)/e^{-kt}\big] = 0 \]

as required.

We have succeeded in finding all functions that obey (3.3.1), i.e. its general solution.

16 The precise transition is $^{14}\mathrm{C} \to {}^{14}\mathrm{N} + e^- + \bar{\nu}_e$, where $e^-$ is an electron and $\bar{\nu}_e$ is an electron antineutrino.
17 Notice that this is very similar to what we needed in Example 3.1.2, except that here the constant is
multiplicative rather than additive. That is const × f(t) rather than const + f(t).


This is worth stating as a theorem. 


Theorem 3.3.2.

A differentiable function Q(t) obeys the differential equation

\[ \frac{dQ}{dt}(t) = -kQ(t) \]

if and only if there is a constant C such that

\[ Q(t) = Ce^{-kt} \]

Before we start to apply the above theorem, we take this opportunity to remind the
reader that in this text we will use log x with no base to indicate the natural logarithm.
That is

\[ \log x = \log_e x = \ln x \]

Both of the notations log(x) and ln(x) are used widely and the reader should be
comfortable with both.


Example 3.3.3

In this example, we determine the value of the constant of proportionality k in
equation (3.3.1) that corresponds to the half-life of $^{14}$C, which is 5730 years.

• Imagine that some plant or animal contains a quantity $Q_0$ of $^{14}$C at its time of death.
Let’s choose the zero point of time t = 0 to be the instant that the plant or animal
died.

• Denote by Q(t) the amount of $^{14}$C in the plant or animal t years after it died. Then
Q(t) must obey both equation (3.3.1) and $Q(0) = Q_0$.

• Since Q(t) must obey equation (3.3.1), Theorem 3.3.2 tells us that there must be a
constant C such that $Q(t) = Ce^{-kt}$. To also have $Q_0 = Q(0) = Ce^{-k \times 0}$, the constant
C must be $Q_0$. That is, $Q(t) = Q_0 e^{-kt}$ for all t ≥ 0.

• By definition, the half-life of $^{14}$C is the length of time that it takes for half of the $^{14}$C
to decay. That is, the half-life $t_{1/2}$ is determined by

\[ Q(t_{1/2}) = \tfrac{1}{2}Q(0) = \tfrac{1}{2}Q_0 \qquad\text{but we know } Q(t) = Q_0 e^{-kt} \]
\[ Q_0 e^{-kt_{1/2}} = \tfrac{1}{2}Q_0 \qquad\text{now cancel } Q_0 \]
\[ e^{-kt_{1/2}} = \tfrac{1}{2} \]

Taking the logarithm of both sides gives

\[ -kt_{1/2} = \log\tfrac{1}{2} = -\log 2 \qquad\text{and so}\qquad k = \frac{\log 2}{t_{1/2}} \]

We are told that, for $^{14}$C, the half-life $t_{1/2} = 5730$, so

\[ k = \frac{\log 2}{5730} = 0.000121 \quad\text{to 6 decimal places} \]

From the work in the above example we have accumulated enough new facts to make
a corollary to Theorem 3.3.2.

Corollary 3.3.4.

The function Q(t) satisfies the equation

\[ \frac{dQ}{dt} = -kQ(t) \]

if and only if

\[ Q(t) = Q(0)\,e^{-kt} \]

The half-life is defined to be the time $t_{1/2}$ which obeys

\[ Q(t_{1/2}) = \tfrac{1}{2} \cdot Q(0). \]

The half-life is related to the constant k by

\[ t_{1/2} = \frac{\log 2}{k} \]


Now here is a typical problem that is solved using Corollary 3.3.4. 


Example 3.3.5

A particular piece of parchment contains about 64% as much $^{14}$C as plants do today.
Estimate the age of the parchment.

Solution. Let Q(t) denote the amount of $^{14}$C in the parchment t years after it was first
created. By equation (3.3.1) and Example 3.3.3,

\[ \frac{dQ}{dt}(t) = -kQ(t) \qquad\text{with } k = \frac{\log 2}{5730} = 0.000121. \]

By Corollary 3.3.4

\[ Q(t) = Q(0)\,e^{-kt} \]

The time at which Q(t) reaches 0.64 Q(0) is determined by

\[ Q(t) = 0.64\,Q(0) \qquad\text{but } Q(t) = Q(0)e^{-kt} \]
\[ Q(0)e^{-kt} = 0.64\,Q(0) \qquad\text{cancel } Q(0) \]
\[ e^{-kt} = 0.64 \qquad\text{take logarithms} \]
\[ -kt = \log 0.64 \]
\[ t = \frac{\log 0.64}{-k} = \frac{\log 0.64}{-0.000121} = 3700 \quad\text{to 2 significant digits.} \]

That is, the parchment¹⁸ is about 37 centuries old.
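The two steps used above (half-life to decay constant, then measured fraction to age) are easy to script. The following Python sketch uses illustrative function names; only the half-life of 5730 years and the 64% figure come from the text.

```python
import math

HALF_LIFE_C14 = 5730.0   # years, as stated in the text

def decay_constant(half_life):
    """k = log(2) / t_half, from Corollary 3.3.4."""
    return math.log(2) / half_life

def age_from_fraction(fraction_remaining, half_life=HALF_LIFE_C14):
    """Solve Q(0) e^{-k t} = fraction * Q(0) for t."""
    return -math.log(fraction_remaining) / decay_constant(half_life)

k = decay_constant(HALF_LIFE_C14)
print(round(k, 6))                      # 0.000121
print(round(age_from_fraction(0.64)))   # roughly 3700 years
```

Using the unrounded k gives an age of just under 3700 years, consistent with the two-significant-digit answer above.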


We have stated that the half-life of $^{14}$C is 5730 years. How can this be determined? We
can explain this using the following example.

Example 3.3.6

A scientist in a B-grade science fiction film is studying a sample of the rare and fictitious
element, implausium¹⁹. With great effort he has produced a sample of pure implausium.
The next day (17 hours later) he comes back to his lab and discovers that his sample
is now only 37% pure. What is the half-life of the element?

Solution. We can again set up our problem using Corollary 3.3.4. Let Q(t) denote the
quantity of implausium at time t, measured in hours. Then we know

\[ Q(t) = Q(0)\,e^{-kt} \]

We also know that

\[ Q(17) = 0.37\,Q(0). \]

That enables us to determine k via

\[ Q(17) = 0.37\,Q(0) = Q(0)e^{-17k} \qquad\text{divide both sides by } Q(0) \]
\[ 0.37 = e^{-17k} \]

and so

\[ k = -\frac{\log 0.37}{17} = 0.05849 \]

We can then convert this to the half-life using Corollary 3.3.4:

\[ t_{1/2} = \frac{\log 2}{k} \approx 11.85 \text{ hours} \]

While this example is entirely fictitious, one really can use this approach to measure the
half-life of materials.

18 The British Museum has an Egyptian mathematical text from the seventeenth century B.C.
19 Implausium leads to even weaker plots than unobtainium.
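The same calculation works for any single measurement of the remaining fraction. A minimal Python sketch follows; the function name is an illustrative choice, and the numbers are those of the fictitious implausium sample.

```python
import math

def half_life_from_measurement(fraction_remaining, elapsed):
    """Infer the half-life from the fraction left after `elapsed` time units."""
    k = -math.log(fraction_remaining) / elapsed   # from fraction = e^{-k * elapsed}
    return math.log(2) / k                        # t_half = log(2) / k

print(half_life_from_measurement(0.37, 17.0))     # about 11.85 (hours)
```

The result is in the same time units as the elapsed time that is passed in.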


3.3.2 Newton’s law of cooling


Recall Newton’s law of cooling from the start of this section: 


The rate of change of temperature of an object is proportional to the difference 
in temperature between the object and its surroundings. The temperature of 
the surroundings is sometimes called the ambient temperature. 


We translated this statement into the following differential equation

Equation 3.3.7 (Newton’s law of cooling).

\[ \frac{dT}{dt}(t) = K\big[T(t) - A\big] \]

where T(t) is the temperature of the object at time t, A is the temperature of its
surroundings, and K is a constant of proportionality. This mathematical model of temperature
change works well when studying a small object in a large, fixed temperature,
environment. For example, a hot cup of coffee in a large room²⁰.

Before we worry about solving this equation, let’s think a little about the sign of the
constant of proportionality. At any time t, there are three possibilities.

• If T(t) > A, that is, if the body is warmer than its surroundings, we would expect
heat to flow from the body into its surroundings and so we would expect the body to
cool off so that $\frac{dT}{dt}(t) < 0$. For this expectation to be consistent with equation (3.3.7),
we need K < 0.

• If T(t) < A, that is the body is cooler than its surroundings, we would expect heat to
flow from the surroundings into the body and so we would expect the body to warm
up so that $\frac{dT}{dt}(t) > 0$. For this expectation to be consistent with equation (3.3.7), we
again need K < 0.

• Finally if T(t) = A, that is the body and its environment have the same temperature,
we would not expect any heat to flow between the two and so we would expect that
$\frac{dT}{dt}(t) = 0$. This does not impose any condition on K.

In conclusion, we would expect K < 0. Of course, we could have chosen to call the
constant of proportionality −k, rather than K. Then the differential equation would be
$\frac{dT}{dt} = -k(T - A)$ and we would expect k > 0.


20 It does not work so well when the object is of a similar size to its surroundings since the temperature of
the surroundings will rise as the object cools. It also fails when there are phase transitions involved;
for example, an ice-cube melting in a warm room does not obey Newton’s law of cooling.

Now to find the general solution to equation (3.3.7). Since this equation is so similar in
form to equation (3.3.1), we might expect a similar solution. Start by trying $T(t) = Ce^{Kt}$
and let’s see what goes wrong. Substitute it into the equation:

\[ \frac{dT}{dt} = K\big(T(t) - A\big) \]
\[ KCe^{Kt} = KCe^{Kt} - KA \]
\[ 0 = -KA \qquad\text{the constant A causes problems!} \]

Let’s try something a little different. Recall that the derivative of a constant is zero. So
we can add or subtract a constant from T(t) without changing its derivative. Set
Q(t) = T(t) + B, then

\[ \frac{dQ}{dt}(t) = \frac{dT}{dt}(t) = K\big(T(t) - A\big) = K\big(Q(t) - B - A\big) \]

where the second equality is Newton’s law of cooling. So if we choose B = −A then we will have

\[ \frac{dQ}{dt}(t) = KQ(t) \]

which is exactly the same form as equation (3.3.1), but with K = −k. So by Theorem 3.3.2

\[ Q(t) = Q(0)\,e^{Kt} \]

We can translate back to T(t), since Q(t) = T(t) − A and Q(0) = T(0) − A. This gives us
the solution.

Corollary 3.3.8.

A differentiable function T(t) obeys the differential equation

\[ \frac{dT}{dt}(t) = K\big[T(t) - A\big] \]

if and only if

\[ T(t) = \big[T(0) - A\big]e^{Kt} + A \]

Just before we put this into action, we remind the reader that $\log x = \log_e x = \ln x$.


Example 3.3.9

The temperature of a glass of iced tea is initially 5°. After 5 minutes, the tea has heated to
10° in a room where the air temperature is 30°.

(a) Determine the temperature as a function of time.

(b) What is the temperature after 10 minutes?

(c) Determine when the tea will reach a temperature of 20°.

Solution. Part (a)

• Denote by T(t) the temperature of the tea t minutes after it was removed from the
fridge, and let A = 30 be the ambient temperature.

• By Newton’s law of cooling,

\[ \frac{dT}{dt} = K(T - A) = K(T - 30) \]

for some, as yet unknown, constant of proportionality K.

• By Corollary 3.3.8,

\[ T(t) = \big[T(0) - 30\big]e^{Kt} + 30 = 30 - 25e^{Kt} \]

since the initial temperature T(0) = 5.

• This solution is not complete because it still contains an unknown constant, namely
K. We have not yet used the given data that T(5) = 10. We can use it to determine
K. At t = 5,

\[ T(5) = 30 - 25e^{5K} = 10 \qquad\text{rearrange} \]
\[ e^{5K} = \frac{20}{25} \]
\[ 5K = \log\frac{20}{25} \qquad\text{and so} \]
\[ K = \frac{1}{5}\log\frac{4}{5} = -0.044629 \quad\text{to 6 decimal places} \]

Part (b)

• To find the temperature at 10 minutes we can just use the solution we have
determined above.

\[ T(10) = 30 - 25e^{10K} = 30 - 25e^{10 \times \frac{1}{5}\log\frac{4}{5}} = 30 - 25e^{2\log\frac{4}{5}} = 30 - 25e^{\log\frac{16}{25}} = 30 - 16 = 14° \]

Part (c)

• We can find when the temperature is 20° by solving T(t) = 20:

\[ 20 = 30 - 25e^{Kt} \qquad\text{rearrange} \]
\[ e^{Kt} = \frac{10}{25} = \frac{2}{5} \]
\[ Kt = \log\frac{2}{5} \]
\[ t = \frac{\log\frac{2}{5}}{K} = 20.5 \text{ minutes} \quad\text{to 1 decimal place} \]
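The three parts of the iced tea example can be reproduced with a small Python sketch built on the solution formula of Corollary 3.3.8. The helper names are illustrative assumptions; the data (5°, 10° after 5 minutes, ambient 30°) are from the example.

```python
import math

def cooling_constant(T0, T1, t1, A):
    """Solve T1 = A + (T0 - A) e^{K t1} for K."""
    return math.log((T1 - A) / (T0 - A)) / t1

def temperature(t, T0, A, K):
    """T(t) = [T(0) - A] e^{K t} + A."""
    return A + (T0 - A) * math.exp(K * t)

def time_to_reach(target, T0, A, K):
    """Solve T(t) = target for t."""
    return math.log((target - A) / (T0 - A)) / K

K = cooling_constant(T0=5.0, T1=10.0, t1=5.0, A=30.0)
print(round(K, 6))                                   # -0.044629
print(round(temperature(10.0, 5.0, 30.0, K), 6))     # 14.0
print(round(time_to_reach(20.0, 5.0, 30.0, K), 1))   # 20.5
```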


A slightly more gruesome example. 


Example 3.3.10

A dead body is discovered at 3:45pm in a room where the temperature is 20°C. At that time
the temperature of the body is 27°C. Two hours later, at 5:45pm, the temperature of the
body is 25.3°C. What was the time of death? Note that the normal (adult human) body
temperature is 37°C.

Solution. We will assume²¹ that the body’s temperature obeys Newton’s law of cooling.

• Denote by T(t) the temperature of the body at time t, with t = 0 corresponding to
3:45pm. We wish to find the time of death; call it $t_d$.

• There is a lot of data in the statement of the problem; we are told that

– the ambient temperature: A = 20
– the temperature of the body when discovered: T(0) = 27
– the temperature of the body 2 hours later: T(2) = 25.3
– assuming the person was a healthy adult right up until he died, the temperature
at the time of death: $T(t_d) = 37$.

• Since we assume the temperature of the body obeys Newton’s law of cooling, we
use Corollary 3.3.8 to find,

\[ T(t) = \big[T(0) - A\big]e^{Kt} + A = 20 + 7e^{Kt} \]

Two unknowns remain, K and $t_d$.

• We can find the constant K by using T(2) = 25.3:

\[ 25.3 = T(2) = 20 + 7e^{2K} \qquad\text{rearrange} \]
\[ 7e^{2K} = 5.3 \qquad\text{rearrange a bit more} \]
\[ 2K = \log\left(\frac{5.3}{7}\right) \]
\[ K = \frac{1}{2}\log\left(\frac{5.3}{7}\right) = -0.139 \quad\text{to 3 decimal places} \]

• Since we know²² that $t_d$ is determined by $T(t_d) = 37$, we have

\[ 37 = T(t_d) = 20 + 7e^{-0.139t_d} \qquad\text{rearrange} \]
\[ e^{-0.139t_d} = \frac{17}{7} \]
\[ -0.139\,t_d = \log\left(\frac{17}{7}\right) \]
\[ t_d = -\frac{1}{0.139}\log\left(\frac{17}{7}\right) = -6.38 \quad\text{to 2 decimal places} \]

Now 6.38 hours is 6 hours and 0.38 × 60 = 23 minutes. So the time of death was 6
hours and 23 minutes before 3:45pm, which is 9:22am.

21 We don’t know any other method!
22 Actually, we are assuming again.
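Here is a hedged Python sketch of the same time-of-death estimate, including the conversion back to a clock time. The calendar date in the code is an arbitrary placeholder (the example gives none), and the variable names are illustrative.

```python
import math
from datetime import datetime, timedelta

AMBIENT = 20.0                   # room temperature, deg C
T_FOUND, T_LATER = 27.0, 25.3    # body temperature at 3:45pm and at 5:45pm
T_ALIVE = 37.0                   # assumed temperature at the time of death

# Fit K from T(2) = A + (T(0) - A) e^{2K}.
K = math.log((T_LATER - AMBIENT) / (T_FOUND - AMBIENT)) / 2.0

# Solve T(t_d) = 37 for t_d, in hours relative to 3:45pm (negative means earlier).
t_d = math.log((T_ALIVE - AMBIENT) / (T_FOUND - AMBIENT)) / K

found = datetime(2000, 1, 1, 15, 45)   # placeholder date; only the time matters
death = found + timedelta(hours=t_d)
print(round(K, 3), round(t_d, 2), death.strftime("%H:%M"))   # -0.139 -6.38 09:22
```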


A slightly tricky example — we need to determine the ambient temperature from three 
measurements at different times. 


Example 3.3.11

A glass of room-temperature water is carried out onto a balcony from an apartment where
the temperature is 22°C. After one minute the water has temperature 26°C and after two
minutes it has temperature 28°C. What is the outdoor temperature?

Solution. We will assume that the temperature of the water obeys Newton’s law
of cooling.

• Let A be the outdoor temperature and T(t) be the temperature of the water t minutes
after it is taken outside.

• By Newton’s law of cooling,

\[ T(t) = A + \big(T(0) - A\big)e^{Kt} \]

by Corollary 3.3.8. Notice there are 3 unknowns here (A, T(0) and K) so we need
three pieces of information to find them all.

• We are told T(0) = 22, so

\[ T(t) = A + (22 - A)e^{Kt}. \]

• We are also told T(1) = 26, which gives

\[ 26 = A + (22 - A)e^{K} \qquad\text{rearrange things} \]
\[ e^{K} = \frac{26 - A}{22 - A} \]

• Finally, T(2) = 28, so

\[ 28 = A + (22 - A)e^{2K} \qquad\text{rearrange} \]
\[ e^{2K} = \frac{28 - A}{22 - A} \qquad\text{but } e^{K} = \frac{26 - A}{22 - A}, \text{ so} \]
\[ \left(\frac{26 - A}{22 - A}\right)^2 = \frac{28 - A}{22 - A} \qquad\text{multiply through by } (22 - A)^2 \]
\[ (26 - A)^2 = (28 - A)(22 - A) \]

We can expand out both sides and collect up terms to get

\[ \underbrace{26^2}_{=676} - 52A + A^2 = \underbrace{28 \times 22}_{=616} - 50A + A^2 \]
\[ 60 = 2A \]
\[ 30 = A \]

So the temperature outside is 30°.
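The algebra above generalises to any three readings $T_0, T_1, T_2$ taken at equally spaced times: expanding $(T_1 - A)^2 = (T_2 - A)(T_0 - A)$ gives $A = (T_1^2 - T_0T_2)/(2T_1 - T_0 - T_2)$. The Python sketch below implements that closed form; the function name is an illustrative assumption and the formula requires equally spaced measurements.

```python
def ambient_from_three_readings(T0, T1, T2):
    """Ambient temperature A from three equally spaced readings.

    Derived from (T1 - A)^2 = (T2 - A)(T0 - A), which follows from
    Newton's law of cooling when the readings are one time step apart.
    """
    denominator = 2 * T1 - T0 - T2
    if denominator == 0:
        raise ValueError("readings change linearly; A cannot be determined")
    return (T1**2 - T0 * T2) / denominator

print(ambient_from_three_readings(22.0, 26.0, 28.0))   # 30.0
```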


3.3.3 Population growth

Suppose that we wish to predict the size P(t) of a population as a function of the time
t. In the most naive model of population growth, each couple produces β offspring (for
some constant β) and then dies. Thus over the course of one generation $\beta\frac{P(t)}{2}$ children are
produced and P(t) parents die so that the size of the population grows from P(t) to

\[ P(t + t_g) = \underbrace{P(t) + \beta\frac{P(t)}{2}}_{\text{parents+offspring}} - \underbrace{P(t)}_{\text{parents die}} = \frac{\beta}{2}P(t) \]

where $t_g$ denotes the lifespan of one generation. The rate of change of the size of the
population per unit time is

\[ \frac{P(t + t_g) - P(t)}{t_g} = \frac{1}{t_g}\left[\frac{\beta}{2}P(t) - P(t)\right] = bP(t) \]

where $b = \frac{\beta - 2}{2t_g}$ is the net birthrate per member of the population per unit time. If we
approximate

\[ \frac{P(t + t_g) - P(t)}{t_g} \approx \frac{dP}{dt}(t) \]

we get the differential equation

Equation 3.3.12 (Simple population model).

\[ \frac{dP}{dt} = bP(t) \]

By Corollary 3.3.4, with −k replaced by b,

\[ P(t) = P(0) \cdot e^{bt} \]

This is called the Malthusian²³ growth model. It is, of course, very simplistic. One of its
main characteristics is that, since $P(t + T) = P(0) \cdot e^{b(t+T)} = P(t) \cdot e^{bT}$, every time you
add T to the time, the population size is multiplied by $e^{bT}$. In particular, the population
size doubles every $\frac{\log 2}{b}$ units of time. The Malthusian growth model can be a reasonably
good model only when the population size is very small compared to its environment²⁴.
A more sophisticated model of population growth, that takes into account the “carrying
capacity of the environment” is considered in the optional subsection below.

23 This is named after Rev. Thomas Robert Malthus. He described this model in a 1798 paper called “An
essay on the principle of population”.
24 That is, the population has plenty of food and space to grow.


Example 3.3.13

In 1927 the population of the world was about 2 billion. In 1974 it was about 4 billion.
Estimate when it reached 6 billion. What will the population of the world be in 2100, assuming
the Malthusian growth model?

Solution. We follow our usual pattern for dealing with such problems.

• Let P(t) be the world’s population t years after 1927. Note that 1974 corresponds to
t = 1974 − 1927 = 47.

• We are assuming that P(t) obeys equation (3.3.12). So, by Corollary 3.3.4 with −k
replaced by b,

\[ P(t) = P(0) \cdot e^{bt} \]

Notice that there are 2 unknowns here (b and P(0)) so we need two pieces of
information to find them.

• We are told P(0) = 2, so

\[ P(t) = 2 \cdot e^{bt} \]

• We are also told P(47) = 4, which gives

\[ 4 = 2 \cdot e^{47b} \qquad\text{clean up} \]
\[ e^{47b} = 2 \qquad\text{take the log and clean up} \]
\[ b = \frac{\log 2}{47} = 0.0147 \quad\text{to 3 significant digits} \]

• We now know P(t) completely, so we can easily determine the predicted
population²⁵ in 2100, i.e. at t = 2100 − 1927 = 173.

\[ P(173) = 2e^{173b} = 2e^{173 \times 0.0147} \approx 25.4 \text{ billion} \]

• Finally, our crude model predicts that the population is 6 billion at the time t that
obeys

\[ P(t) = 2e^{bt} = 6 \qquad\text{clean up} \]
\[ e^{bt} = 3 \qquad\text{take the log and clean up} \]
\[ t = \frac{\log 3}{b} = 47\,\frac{\log 3}{\log 2} = 74.5 \]

which corresponds²⁶ to the middle of 2001.

25 The 2015 Revision of World Population, a publication of the United Nations, predicts that the world’s
population in 2100 will be about 11 billion. But “about” covers a pretty large range. They give an 80%
confidence interval running from 10 billion to 12.5 billion.
26 The world population really reached 6 billion in about 1999.
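The Malthusian fit is short enough to check directly. The Python sketch below uses the two data points from the example (2 billion in 1927, 4 billion in 1974) and the unrounded value of b; the function names are illustrative.

```python
import math

P0 = 2.0                      # billions, in 1927 (t = 0)
b = math.log(2) / 47.0        # from P(47) = 4, i.e. e^{47 b} = 2

def population(t):
    """Malthusian model P(t) = P(0) e^{b t}, with t in years after 1927."""
    return P0 * math.exp(b * t)

def years_until(target):
    """Solve P(t) = target for t."""
    return math.log(target / P0) / b

print(round(b, 4))                       # 0.0147
print(round(population(173.0), 1))       # model prediction for 2100, in billions
print(round(1927 + years_until(6.0), 1)) # year in which the model reaches 6 billion
```

Since 173 years is about 3.7 doubling periods of 47 years, the model predicts roughly 2 × 2^3.7, or about 26 billion, for 2100.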
(optional) Logistic population growth

Logistic growth adds one more wrinkle to the simple population model. It assumes that
the population only has access to limited resources. As the size of the population grows
the amount of food available to each member decreases. This in turn causes the net birth
rate b to decrease. In the logistic growth model $b = b_0\left(1 - \frac{P}{K}\right)$, where K is called the
carrying capacity of the environment, so that

\[ P'(t) = b_0\left(1 - \frac{P(t)}{K}\right)P(t) \]

We can learn quite a bit about the behaviour of solutions to differential equations like
this, without ever finding formulae for the solutions, just by watching the sign of P′(t).
For concreteness, we’ll look at solutions of the differential equation

\[ \frac{dP}{dt}(t) = \big(6000 - 3P(t)\big)P(t) \]

We’ll sketch the graphs of four functions P(t) that obey this equation.

• For the first function, P(0) = 0.
• For the second function, P(0) = 1000.
• For the third function, P(0) = 2000.
• For the fourth function, P(0) = 3000.

The sketches will be based on the observation that (6000 − 3P) P = 3(2000 − P) P

• is zero for P = 0, 2000,
• is strictly positive for 0 < P < 2000 and
• is strictly negative for P > 2000.

Consequently

\[ \frac{dP}{dt}(t) \begin{cases} = 0 & \text{if } P(t) = 0 \\ > 0 & \text{if } 0 < P(t) < 2000 \\ = 0 & \text{if } P(t) = 2000 \\ < 0 & \text{if } P(t) > 2000 \end{cases} \]

Thus if P(t) is some function that obeys $\frac{dP}{dt}(t) = \big(6000 - 3P(t)\big)P(t)$, then as the graph of
P(t) passes through (t, P(t)), the graph has

• slope zero, i.e. is horizontal, if P(t) = 0
• positive slope, i.e. is increasing, if 0 < P(t) < 2000
• slope zero, i.e. is horizontal, if P(t) = 2000
• negative slope, i.e. is decreasing, if P(t) > 2000

as illustrated in the figure

As a result,

• if P(0) = 0, the graph starts out horizontally. In other words, as t starts to increase,
P(t) remains at zero, so the slope of the graph remains at zero. The population size
remains zero for all time. As a check, observe that the function P(t) = 0 obeys
$\frac{dP}{dt}(t) = \big(6000 - 3P(t)\big)P(t)$ for all t.

• Similarly, if P(0) = 2000, the graph again starts out horizontally. So P(t) remains at
2000 and the slope remains at zero. The population size remains 2000 for all time.
Again, the function P(t) = 2000 obeys $\frac{dP}{dt}(t) = \big(6000 - 3P(t)\big)P(t)$ for all t.

• If P(0) = 1000, the graph starts out with positive slope. So P(t) increases with t. As
P(t) increases towards 2000, the slope (6000 − 3P(t)) P(t), while remaining positive,
gets closer and closer to zero. As the graph approaches height 2000, it becomes more
and more horizontal. The graph cannot actually cross from below 2000 to above
2000, because to do so, it would have to have strictly positive slope for some value
of P above 2000, which is not allowed.

• If P(0) = 3000, the graph starts out with negative slope. So P(t) decreases with
t. As P(t) decreases towards 2000, the slope (6000 − 3P(t)) P(t), while remaining
negative, gets closer and closer to zero. As the graph approaches height 2000, it
becomes more and more horizontal. The graph cannot actually cross from above
2000 to below 2000, because to do so, it would have to have negative slope for some
value of P below 2000, which is not allowed.

These curves are sketched in the figure below. We conclude that for any initial population
size P(0), except P(0) = 0, the population size approaches 2000 as t → ∞.


Figure 3.3.1. 
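The qualitative conclusions above can also be checked numerically. The following is only a rough sketch: it uses the forward Euler method (a technique not discussed in the text), and the function name `simulate_logistic`, the step size `dt` and the number of steps are illustrative assumptions chosen so that the crude method behaves well for this particular equation.

```python
def simulate_logistic(p0, dt=1e-5, steps=1000):
    """Forward-Euler sketch for dP/dt = (6000 - 3P) P starting from P(0) = p0.

    dt and steps are illustrative choices, not values taken from the text.
    Returns the list of computed population sizes, one per time step.
    """
    p = float(p0)
    history = [p]
    for _ in range(steps):
        slope = (6000.0 - 3.0 * p) * p  # the sign of the slope decides up, down or flat
        p += dt * slope
        history.append(p)
    return history


for p0 in (0, 1000, 2000, 3000):
    final = simulate_logistic(p0)[-1]
    print(f"P(0) = {p0:4d}  ->  final computed value {final:.6f}")
```

With these settings the runs started at 1000 and 3000 should settle near 2000, while the runs started at 0 and 2000 never move, in line with the sign analysis.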
3.4 Approximating functions near a specified point — Taylor polynomials
Suppose that you are interested in the values of some function f(x) for x near some fixed
point a. When the function is a polynomial or a rational function we can use some arithmetic
(and maybe some hard work) to write down the answer. For example:


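\[
f(x) = \frac{x^2 - 3}{x^2 - 2x + 4}
\qquad
f(1/5) = \frac{\frac{1}{25} - 3}{\frac{1}{25} - \frac{2}{5} + 4}
       = \frac{\frac{1 - 75}{25}}{\frac{1 - 10 + 100}{25}}
       = \frac{-74}{91}
\]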


Tedious, but we can do it. On the other hand if you are asked to compute sin(1/10) then 
what can we do? We know that a calculator can work it out 


sin(1/10) = 0.09983341... 


but how does the calculator do this? How did people compute this before calculators?²⁷
A hint comes from the following sketch of sin(x) for x around 0. 


Figure 3.4.1. 


The above figure shows that the curves y = x and y = sin x are almost the same when x 
is close to 0. Hence if we want the value of sin(1/10) we could just use this approximation 
y = x to get 


sin(1/10) ≈ 1/10.


Of course, in this case we simply observed that one function was a good approximation 
of the other. We need to know how to find such approximations more systematically. 

More precisely, say we are given a function f(x) that we wish to approximate close to
some point x = a, and we need to find another function F(x) that

• is simple and easy to compute²⁸, and
• is a good approximation to f(x) for x values close to a.

Further, we would like to understand how good our approximation actually is. Namely
we need to be able to estimate the error |f(x) − F(x)|.

27 Originally the word “calculator” referred not to the software or electronic (or even mechanical) device
we think of today, but rather to a person who performed calculations.
28 It is no good approximating a function with something that is even more difficult to work with.

There are many different ways to approximate a function and we will discuss one 
family of approximations: Taylor polynomials. This is an infinite family of ever improving 
approximations, and our starting point is the very simplest. 


3.4.1 » Zeroth approximation — the constant approximation 


The simplest functions are those that are constants. And our zeroth²⁹ approximation will
be by a constant function. That is, the approximating function will have the form F(x) = 
A, for some constant A. Notice that this function is a polynomial of degree zero. 

To ensure that F(x) is a good approximation for x close to a, we choose A so that f (x) 
and F(x) take exactly the same value when x = a. 


\[ F(x) = A \quad\Longrightarrow\quad F(a) = A = f(a) \quad\Longrightarrow\quad A = f(a) \]

Our first, and crudest, approximation rule is

Equation 3.4.1 (Constant approximation).

\[ f(x) \approx f(a) \]


An important point to note is that we need to know f(a) — if we cannot compute that 
easily then we are not going to be able to proceed. We will often have to choose a (the 
point around which we are approximating f(x)) with some care to ensure that we can 
compute f(a). 

Here is a figure showing the graphs of a typical f(x) and approximating function F(x). 
At x = a, f(x) and F(x) take the same value. For x very near a, the values of f(x) and F(x)


remain close together. But the quality of the approximation deteriorates fairly quickly as x 
moves away from a. Clearly we could do better with a straight line that follows the slope 
of the curve. That is our next approximation. 


29 It barely counts as an approximation at all, but it will help build intuition. Because of this, and the fact
that a constant is a polynomial of degree 0, we'll start counting our approximations from zero rather
than 1.

But before then, an example:


Example 3.4.2 


Use the constant approximation to estimate $e^{0.1}$.

Solution. First set $f(x) = e^x$.

• Now we first need to pick a point x = a to approximate the function. This point
needs to be close to 0.1 and we need to be able to evaluate f(a) easily. The obvious
choice is a = 0.

• Then our constant approximation is just

\[ F(x) = f(0) = e^0 = 1 \qquad\text{so}\qquad F(0.1) = 1 \]

Note that $e^{0.1} = 1.105170918\ldots$, so even this approximation isn't too bad.


3.4.2 »» First approximation — the linear approximation 


Our first³⁰ approximation improves on our zeroth approximation by allowing the approximating
function to be a linear function of x rather than just a constant function. That is,
we allow F(x) to be of the form A + Bx, for some constants A and B. 

To ensure that F(x) is a good approximation for x close to a, we still require that f (x) 
and F(x) have the same value at x = a (that was our zeroth approximation). Our ad- 
ditional requirement is that their tangent lines at x = a have the same slope — that the 
derivatives of f(x) and F(x) are the same at x = a. Hence 


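\[
\begin{aligned}
F(x) &= A + Bx \quad\Longrightarrow\quad F(a) = A + Ba = f(a) \\
F'(x) &= B \quad\Longrightarrow\quad F'(a) = B = f'(a)
\end{aligned}
\]
So we must have $B = f'(a)$. Substituting this into $A + Ba = f(a)$ we get $A = f(a) - a f'(a)$.
So we can write
\[
\begin{aligned}
F(x) = A + Bx &= f(a) - a f'(a) + f'(a)\cdot x \\
              &= f(a) + f'(a)\cdot(x - a)
\end{aligned}
\]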
We write it in this form because we can now clearly see that our first approximation is just 


an extension of our zeroth approximation. This first approximation is also often called the 
linear approximation of f(x) about x = a. 


Equation 3.4.3 (Linear approximation).

\[ f(x) \approx f(a) + f'(a)(x - a) \]

30 Recall that we started counting from zero.

We should again stress that in order to form this approximation we need to know f(a) and
f'(a). If we cannot compute them easily then we are not going to be able to proceed.
Recall, from Theorem 2.3.2, that y = f(a) + f'(a)(x − a) is exactly the equation of the
tangent line to the curve y = f(x) at a. Here is a figure showing the graphs of a typical
f(x) and the approximating function F(x). Observe that the graph of f(a) + f'(a)(x − a)


remains close to the graph of f(x) for a much larger range of x than did the graph of our 
constant approximation, f(a). One can also see that we can improve this approximation 
if we can use a function that curves down rather than being perfectly straight. That is our 
next approximation. 

But before then, back to our example: 


Example 3.4.4 


Use the linear approximation to estimate $e^{0.1}$.

Solution. First set $f(x) = e^x$ and a = 0 as before.

• To form the linear approximation we need f(a) and f'(a):

\[
\begin{aligned}
f(x) &= e^x & f(0) &= 1 \\
f'(x) &= e^x & f'(0) &= 1
\end{aligned}
\]

• Then our linear approximation is

\[
\begin{aligned}
F(x) &= f(0) + x f'(0) = 1 + x \\
F(0.1) &= 1.1
\end{aligned}
\]

Recall that $e^{0.1} = 1.105170918\ldots$, so the linear approximation is almost correct to 3 digits.


It is worth doing another simple example here. 


Example 3.4.5 


Use a linear approximation to estimate $\sqrt{4.1}$.

Solution. First set $f(x) = \sqrt{x}$. Hence $f'(x) = \frac{1}{2\sqrt{x}}$. Then we are trying to approximate
f(4.1). Now we need to choose a sensible a value.

• We need to choose a so that f(a) and f'(a) are easy to compute.
  - We could try a = 4.1, but then we need to compute f(4.1) and f'(4.1),
    which is our original problem and more!
  - We could try a = 0, but then f(0) = 0 and f'(0) does not exist.
  - Setting a = 1 gives us f(1) = 1 and f'(1) = 1/2. This would work, but we can
    get a better approximation by choosing a closer to 4.1.
  - Indeed we can set a to be the square of any rational number and we'll get a
    result that is easy to compute.
  - Setting a = 4 gives f(4) = 2 and f'(4) = 1/4. This seems good enough.

• Substitute this into equation (3.4.3) to get

\[
f(4.1) \approx f(4) + f'(4)\cdot(4.1 - 4) = 2 + \frac{0.1}{4} = 2 + 0.025 = 2.025
\]

Notice that the true value is $\sqrt{4.1} = 2.024845673\ldots$.
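The effect of the choice of a can be seen numerically. The sketch below is illustrative only; the helper name `linear_approx_sqrt` is an assumption, and it simply evaluates equation (3.4.3) for f(x) = √x at two of the candidate values of a considered in the example.

```python
import math


def linear_approx_sqrt(x, a):
    """Linear approximation f(a) + f'(a)(x - a) for f(x) = sqrt(x), assuming a > 0."""
    f_a = math.sqrt(a)
    fprime_a = 1.0 / (2.0 * math.sqrt(a))
    return f_a + fprime_a * (x - a)


exact = math.sqrt(4.1)
for a in (1.0, 4.0):
    estimate = linear_approx_sqrt(4.1, a)
    print(f"a = {a}: estimate {estimate:.6f}, error {abs(estimate - exact):.6f}")
```

Choosing a = 4 rather than a = 1 keeps x − a small, which is why the second estimate is so much closer to the true value.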


3.4.3 » Second approximation — the quadratic approximation 


We next develop a still better approximation by now allowing the approximating function
to be a quadratic function of x. That is, we allow F(x) to be of the form $A + Bx + Cx^2$, for
some constants A, B and C. To ensure that F(x) is a good approximation for x close to a,
we choose A, B and C so that

• f(a) = F(a) (just as in our zeroth approximation),
• f'(a) = F'(a) (just as in our first approximation), and
• f''(a) = F''(a) (this is a new condition).


These conditions give us the following equations 


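\[
\begin{aligned}
F(x) &= A + Bx + Cx^2 \quad\Longrightarrow\quad F(a) = A + Ba + Ca^2 = f(a) \\
F'(x) &= B + 2Cx \quad\Longrightarrow\quad F'(a) = B + 2Ca = f'(a) \\
F''(x) &= 2C \quad\Longrightarrow\quad F''(a) = 2C = f''(a)
\end{aligned}
\]
Solve these for C first, then B and finally A.
\[
\begin{aligned}
C &= \tfrac{1}{2} f''(a) && \text{substitute} \\
B &= f'(a) - 2Ca = f'(a) - a f''(a) && \text{substitute again} \\
A &= f(a) - Ba - Ca^2 = f(a) - a\,[f'(a) - a f''(a)] - \tfrac{1}{2} f''(a)\,a^2
\end{aligned}
\]
Then put things back together to build up F(x):
\[
\begin{aligned}
F(x) &= f(a) - f'(a)\,a + \tfrac{1}{2} f''(a)\,a^2 && \text{(this line is } A\text{)} \\
     &\quad + f'(a)\,x - f''(a)\,a x && \text{(this line is } Bx\text{)} \\
     &\quad + \tfrac{1}{2} f''(a)\,x^2 && \text{(this line is } Cx^2\text{)} \\
     &= f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2
\end{aligned}
\]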


Oof! We again write it in this form because we can now clearly see that our second ap- 
proximation is just an extension of our first approximation. 
Our second approximation is called the quadratic approximation: 


Equation 3.4.6 (Quadratic approximation).

\[ f(x) \approx f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2 \]

Here is a figure showing the graphs of a typical f(x) and approximating function F(x).

[Figure: the curves y = f(x) and $y = F(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2$.]

This new approximation looks better than both the zeroth and the first.
Now there is actually an easier way to derive this approximation, which we show you
now. Let us rewrite³¹ F(x) so that it is easy to evaluate it and its derivatives at x = a:

\[ F(x) = \alpha + \beta\cdot(x - a) + \gamma\cdot(x - a)^2 \]

Then
\[
\begin{aligned}
F(x) &= \alpha + \beta\cdot(x - a) + \gamma\cdot(x - a)^2 & F(a) &= \alpha = f(a) \\
F'(x) &= \beta + 2\gamma\cdot(x - a) & F'(a) &= \beta = f'(a) \\
F''(x) &= 2\gamma & F''(a) &= 2\gamma = f''(a)
\end{aligned}
\]

And from these we can clearly read off the values of α, β and γ and so recover our function
F(x). Additionally if we write things this way, then it is quite clear how to extend this to
a cubic approximation and a quartic approximation and so on.

Return to our example: 


31 Any polynomial of degree two can be written in this form. For example, when a = 1,
$3 + 2x + x^2 = 6 + 4(x - 1) + (x - 1)^2$.

Example 3.4.7

Use the quadratic approximation to estimate $e^{0.1}$.

Solution. Set $f(x) = e^x$ and a = 0 as before.

• To form the quadratic approximation we need f(a), f'(a) and f''(a):

\[
\begin{aligned}
f(x) &= e^x & f(0) &= 1 \\
f'(x) &= e^x & f'(0) &= 1 \\
f''(x) &= e^x & f''(0) &= 1
\end{aligned}
\]

• Then our quadratic approximation is

\[
\begin{aligned}
F(x) &= f(0) + x f'(0) + \tfrac{1}{2} x^2 f''(0) = 1 + x + \frac{x^2}{2} \\
F(0.1) &= 1.105
\end{aligned}
\]

Recall that $e^{0.1} = 1.105170918\ldots$, so the quadratic approximation is quite accurate with
very little effort.
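To compare the three approximations of $e^{0.1}$ side by side, here is a small sketch. The function name `approximations_of_exp` is an illustrative assumption; the formulas are the constant, linear and quadratic approximations about a = 0 worked out in the examples above.

```python
import math


def approximations_of_exp(x):
    """Constant, linear and quadratic approximations of e**x about a = 0."""
    constant = 1.0
    linear = 1.0 + x
    quadratic = 1.0 + x + x**2 / 2
    return constant, linear, quadratic


x = 0.1
exact = math.exp(x)
names = ("constant", "linear", "quadratic")
for name, value in zip(names, approximations_of_exp(x)):
    print(f"{name:9s}: {value:.6f}   error = {abs(exact - value):.6f}")
```

Each extra term should shrink the error noticeably, which is the pattern the Taylor polynomials later in this section make systematic.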


Before we go on, let us first introduce (or revise) some notation that will make our 
discussion easier. 


» Whirlwind tour of summation notation 


In the remainder of this section we will frequently need to write sums involving a large 
number of terms. Writing out the summands explicitly can become quite impractical — 
for example, say we need the sum of the first 11 squares: 


\[ 1 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 + 8^2 + 9^2 + 10^2 + 11^2 \]

This becomes tedious. Where the pattern is clear, we will often skip the middle few terms
and instead write

\[ 1 + 2^2 + \cdots + 11^2. \]

A far more precise way to write this is using Σ (capital-sigma) notation. For example, we
can write the above sum as

\[ \sum_{k=1}^{11} k^2 \]

This is read as

The sum from k equals 1 to 11 of $k^2$.


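More generally

Notation 3.4.8.

Let m ≤ n be integers and let f(x) be a function defined on the integers. Then we
write

\[ \sum_{k=m}^{n} f(k) \]

to mean the sum of f(k) for k from m to n:

\[ f(m) + f(m+1) + f(m+2) + \cdots + f(n-1) + f(n) \]

Similarly we write

\[ \sum_{i=m}^{n} a_i \]

to mean

\[ a_m + a_{m+1} + a_{m+2} + \cdots + a_{n-1} + a_n \]

for some set of coefficients $\{a_m, \ldots, a_n\}$.

Consider the example

\[ \sum_{k=3}^{7} \frac{1}{k^2} = \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} \]

It is important to note that the right hand side of this expression evaluates to a number³²;
it does not contain “k”. The summation index k is just a “dummy” variable and it does
not have to be called k. For example

\[ \sum_{k=3}^{7} \frac{1}{k^2} = \sum_{i=3}^{7} \frac{1}{i^2} = \sum_{j=3}^{7} \frac{1}{j^2} = \sum_{\ell=3}^{7} \frac{1}{\ell^2} \]

Also the summation index has no meaning outside the sum. For example

\[ k \sum_{k=3}^{7} \frac{1}{k^2} \]

has no mathematical meaning; it is gibberish³³.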
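Summation notation maps directly onto a loop. The sketch below is illustrative; the helper name `sigma` is an assumption, and exact fractions are used so that the second sum can be compared with the value quoted in the footnote.

```python
from fractions import Fraction


def sigma(summand, m, n):
    """Return summand(m) + summand(m + 1) + ... + summand(n) for integers m <= n."""
    return sum(summand(k) for k in range(m, n + 1))


# The sum of the first 11 squares.
sum_of_squares = sigma(lambda k: k**2, 1, 11)

# The sum of 1/k^2 for k from 3 to 7, kept as an exact fraction.
reciprocal_squares = sigma(lambda k: Fraction(1, k**2), 3, 7)

# Renaming the dummy index changes nothing.
same_sum_other_index = sigma(lambda j: Fraction(1, j**2), 3, 7)

print(sum_of_squares, reciprocal_squares, reciprocal_squares == same_sum_other_index)
```

The loop variable plays exactly the role of the dummy index: it exists only inside the sum and its name is irrelevant.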


32 Some careful addition shows it is $\frac{46181}{176400}$.
33 Or possibly gobbledygook. For a discussion of statements without meaning and why one should avoid
them we recommend the book “Bendable learnings: the wisdom of modern management” by Don
Watson.

3.4.4 » Still better approximations — Taylor polynomials


We can use the same strategy to generate still better approximations by polynomials³⁴ of
any degree we like. As was the case with the approximations above, we determine the
coefficients of the polynomial by requiring that, at the point x = a, the approximation and
its first n derivatives agree with those of the original function.

Rather than simply moving to a cubic polynomial, let us try to write things in a more
general way. We will consider approximating the function f(x) using a polynomial, $T_n(x)$,
of degree n, where n is a non-negative integer. As we discussed above, the algebra is
easier if we write

\[
\begin{aligned}
T_n(x) &= c_0 + c_1(x - a) + c_2(x - a)^2 + \cdots + c_n(x - a)^n \\
       &= \sum_{k=0}^{n} c_k (x - a)^k \qquad \text{using } \Sigma \text{ notation}
\end{aligned}
\]

The above form³⁵ ³⁶ makes it very easy to evaluate this polynomial and its derivatives at
x = a. Before we proceed, we remind the reader of some notation (see Notation 2.2.8):

• Let f(x) be a function and k be a positive integer. We can denote its $k^{\text{th}}$ derivative
with respect to x by

\[ f^{(k)}(x) \qquad \left(\frac{d}{dx}\right)^{k} f(x) \qquad \frac{d^k f}{dx^k} \]

Additionally we will need


34 Polynomials are generally a good choice for an approximating function since they are so easy to work
with. Depending on the situation other families of functions may be more appropriate. For example
if you are approximating a periodic function, then sums of sines and cosines might be a better choice;
this leads to Fourier series.

35 Any polynomial in x of degree n can also be expressed as a polynomial in (x − a) of the same degree n
and vice versa. So $T_n(x)$ really still is a polynomial of degree n.

36 Furthermore when x is close to a, $(x - a)^k$ decreases very quickly as k increases, which often makes the
“high k” terms in $T_n(x)$ very small. This can be a considerable advantage when building up approximations
by adding more and more terms. If we were to rewrite $T_n(x)$ in the form $\sum_{k=0}^{n} b_k x^k$ the “high k”
terms would typically not be very small when x is close to a.

Let n be a positive integer³⁷, then n-factorial, denoted n!, is the product

\[ n! = n \times (n - 1) \times \cdots \times 3 \times 2 \times 1 \]

Further, we use the convention that

\[ 0! = 1 \]

The first few factorials are

\[ 1! = 1 \qquad 2! = 2 \qquad 3! = 6 \qquad 4! = 24 \qquad 5! = 120 \qquad 6! = 720 \]

Now consider $T_n(x)$ and its derivatives:

\[
\begin{aligned}
T_n(x) &= c_0 + c_1(x - a) + c_2(x - a)^2 + c_3(x - a)^3 + \cdots + c_n(x - a)^n \\
T_n'(x) &= c_1 + 2c_2(x - a) + 3c_3(x - a)^2 + \cdots + n c_n(x - a)^{n-1} \\
T_n''(x) &= 2c_2 + 6c_3(x - a) + \cdots + n(n - 1) c_n(x - a)^{n-2} \\
T_n'''(x) &= 6c_3 + \cdots + n(n - 1)(n - 2) c_n(x - a)^{n-3} \\
&\;\;\vdots \\
T_n^{(n)}(x) &= n!\cdot c_n
\end{aligned}
\]

Now notice that when we substitute x = a into the above expressions only the constant
terms survive and we get

\[
\begin{aligned}
T_n(a) &= c_0 = 0!\cdot c_0 \\
T_n'(a) &= c_1 = 1!\cdot c_1 \\
T_n''(a) &= 2c_2 = 2!\cdot c_2 \\
T_n'''(a) &= 6c_3 = 3!\cdot c_3 \\
&\;\;\vdots \\
T_n^{(n)}(a) &= n!\cdot c_n
\end{aligned}
\]

So now if we want to set the coefficients of $T_n(x)$ so that it agrees with f(x) at x = a then
we need

\[ T_n(a) = c_0 = f(a) \]


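37 It is actually possible to define the factorial of positive real numbers and even negative numbers but it
requires more advanced calculus and is outside the scope of this course. The interested reader should
look up the Gamma function.

We also want the first n derivatives of $T_n(x)$ to agree with the derivatives of f(x) at x = a,
so

\[
\begin{aligned}
T_n'(a) &= c_1 = f'(a) & c_1 &= f'(a) = \frac{1}{1!} f'(a) \\
T_n''(a) &= 2\cdot c_2 = f''(a) & c_2 &= \frac{1}{2} f''(a) = \frac{1}{2!} f''(a) \\
T_n'''(a) &= 6\cdot c_3 = f'''(a) & c_3 &= \frac{1}{6} f'''(a) = \frac{1}{3!} f'''(a)
\end{aligned}
\]

More generally, making the $k^{\text{th}}$ derivatives agree at x = a requires:

\[ T_n^{(k)}(a) = k!\cdot c_k = f^{(k)}(a) \qquad c_k = \frac{1}{k!} f^{(k)}(a) \]

And finally the $n^{\text{th}}$ derivative:

\[ T_n^{(n)}(a) = n!\cdot c_n = f^{(n)}(a) \qquad c_n = \frac{1}{n!} f^{(n)}(a) \]

Putting this all together we have

Equation 3.4.10.

\[
f(x) \approx T_n(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + \cdots + \frac{1}{n!} f^{(n)}(a)(x - a)^n
= \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)(x - a)^k
\]

Let us formalise this definition.

Let a be a constant and let n be a non-negative integer. The $n^{\text{th}}$ degree Taylor
polynomial for f(x) about x = a is

\[ T_n(x) = \sum_{k=0}^{n} \frac{1}{k!} f^{(k)}(a)\cdot(x - a)^k \]

The special case a = 0 is called a Maclaurin³⁸ polynomial.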


Before we proceed with some examples, a couple of remarks are in order. 


38 The polynomials are named after Brook Taylor who devised a general method for constructing them
in 1715. Slightly later, Colin Maclaurin made extensive use of the special case a = 0 (with attribution
of the general case to Taylor) and it is now named after him. The special case of a = 0 was worked
on previously by James Gregory and Isaac Newton, and some specific cases were known to the 14th
century Indian mathematician Madhava of Sangamagrama.

• While we can compute a Taylor polynomial about any a-value (providing the derivatives
exist), in order to be a useful approximation, we must be able to compute
$f(a), f'(a), \ldots, f^{(n)}(a)$ easily. This means we must choose the point a with care. Indeed
for many functions the choice a = 0 is very natural, hence the prominence of
Maclaurin polynomials.

• If we have computed the approximation $T_n(x)$, then we can readily extend this to
the next Taylor polynomial $T_{n+1}(x)$ since

\[ T_{n+1}(x) = T_n(x) + \frac{1}{(n+1)!} f^{(n+1)}(a)\cdot(x - a)^{n+1} \]

This is very useful if we discover that $T_n(x)$ is an insufficient approximation, because
then we can produce $T_{n+1}(x)$ without having to start again from scratch.
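The Taylor polynomial formula, and the remark about extending $T_n$ to $T_{n+1}$, can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: the function names `taylor_polynomial` and `extend_taylor` are hypothetical, and the caller is assumed to supply the derivative values $f^{(k)}(a)$ by hand.

```python
import math


def taylor_polynomial(derivatives_at_a, a, x):
    """Evaluate the sum over k of f^(k)(a) / k! * (x - a)**k.

    derivatives_at_a[k] must hold f^(k)(a), so a list of length n + 1 gives T_n(x).
    """
    return sum(d / math.factorial(k) * (x - a) ** k
               for k, d in enumerate(derivatives_at_a))


def extend_taylor(previous_value, next_derivative_at_a, degree, a, x):
    """Go from T_n(x) to T_{n+1}(x) by adding one term; degree is n + 1."""
    return previous_value + next_derivative_at_a / math.factorial(degree) * (x - a) ** degree


# f(x) = e**x about a = 0: every derivative equals 1 there.
t2 = taylor_polynomial([1, 1, 1], a=0.0, x=0.1)
t3 = extend_taylor(t2, 1, degree=3, a=0.0, x=0.1)
print(t2, t3, math.exp(0.1))
```

Note that `extend_taylor` reuses the previously computed value, mirroring the point that nothing has to be recomputed from scratch.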


3.4.5 » Some examples 
Let us return to our running example of $e^x$:

Example 3.4.12

The constant, linear and quadratic approximations we used above were the first few
Maclaurin polynomial approximations of $e^x$. That is

\[ T_0(x) = 1 \qquad T_1(x) = 1 + x \qquad T_2(x) = 1 + x + \frac{x^2}{2} \]

Since $\frac{d}{dx} e^x = e^x$, the Maclaurin polynomials are very easy to compute. Indeed this invariance
under differentiation means that

\[ f^{(n)}(x) = e^x \quad n = 0, 1, 2, \ldots \qquad\text{so}\qquad f^{(n)}(0) = 1 \]

Thus we can write down the seventh Maclaurin polynomial very easily:

\[ T_7(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + \frac{x^6}{720} + \frac{x^7}{5040} \]

Also notice that if we use this to approximate the value of $e^1$ we obtain:

\[
e^1 \approx T_7(1) = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \frac{1}{720} + \frac{1}{5040}
= \frac{685}{252} = 2.718253968\ldots
\]

The true value of e is 2.718281828..., so the approximation has an error of about $3 \times 10^{-5}$.

Under the assumption that the accuracy of the approximation improves with n (an 
assumption we examine in Subsection 3.4.8 below) we can see that the approximation of e 
above can be improved by adding more and more terms. Indeed this is how the expression 


for e in equation (2.7.3) in Section 2.7 comes about. 
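The arithmetic in this example is easy to reproduce exactly. The sketch below uses exact rational arithmetic; the function name `maclaurin_exp` is an illustrative assumption.

```python
from fractions import Fraction
from math import e, factorial


def maclaurin_exp(n, x):
    """Exact value of T_n(x), the sum of x**k / k! for k = 0..n, for a rational x."""
    x = Fraction(x)
    return sum(x**k / factorial(k) for k in range(n + 1))


t7 = maclaurin_exp(7, 1)
print(t7, float(t7), e - float(t7))

# Adding more terms shrinks the error further.
for n in (3, 5, 7, 9):
    print(n, e - float(maclaurin_exp(n, 1)))
```

Keeping the sum as a fraction makes it possible to check the closed form 685/252 directly before converting to a decimal.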


Now that we have examined Maclaurin polynomials for e* we should take a look at log x. 
Notice that we cannot compute a Maclaurin polynomial for log x since it is not defined at 
x = 0. 


Example 3.4.13 


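Compute the 5th degree Taylor polynomial for log x about x = 1.

Solution. We have been told a = 1 and fifth degree, so we should start by writing down
the function and its first five derivatives:

\[
\begin{aligned}
f(x) &= \log x & f(1) &= \log 1 = 0 \\
f'(x) &= \frac{1}{x} & f'(1) &= 1 \\
f''(x) &= \frac{-1}{x^2} & f''(1) &= -1 \\
f'''(x) &= \frac{2}{x^3} & f'''(1) &= 2 \\
f^{(4)}(x) &= \frac{-6}{x^4} & f^{(4)}(1) &= -6 \\
f^{(5)}(x) &= \frac{24}{x^5} & f^{(5)}(1) &= 24
\end{aligned}
\]
Substituting this into equation (3.4.10) gives
\[
\begin{aligned}
T_5(x) &= 0 + 1\cdot(x - 1) + \frac{1}{2}\cdot(-1)\cdot(x - 1)^2 + \frac{1}{6}\cdot 2\cdot(x - 1)^3 + \frac{1}{24}\cdot(-6)\cdot(x - 1)^4 + \frac{1}{120}\cdot 24\cdot(x - 1)^5 \\
       &= (x - 1) - \frac{1}{2}(x - 1)^2 + \frac{1}{3}(x - 1)^3 - \frac{1}{4}(x - 1)^4 + \frac{1}{5}(x - 1)^5
\end{aligned}
\]

Again, it is not too hard to generalise the above work to find the Taylor polynomial of
degree n. With a little work one can show that

\[ T_n(x) = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k}(x - 1)^k \]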
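As a sanity check on the general formula, the sketch below evaluates the degree-n Taylor polynomial of log x about x = 1 in two ways: from the closed-form sum, and directly from the derivative values listed in the example. The function names are illustrative assumptions.

```python
import math


def taylor_log(n, x):
    """T_n(x) for log x about 1, using the sum of (-1)**(k+1) * (x - 1)**k / k for k = 1..n."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n + 1))


def taylor_log_from_derivatives(x):
    """T_5(x) built from the derivative values f^(k)(1) = 0, 1, -1, 2, -6, 24."""
    derivatives = [0, 1, -1, 2, -6, 24]
    return sum(d / math.factorial(k) * (x - 1) ** k
               for k, d in enumerate(derivatives))


x = 1.1
print(taylor_log(5, x), taylor_log_from_derivatives(x), math.log(x))
```

The two evaluations should agree up to rounding, and both should be close to log(1.1) because x = 1.1 is near the expansion point.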


For cosine: 


Example 3.4.14 


Find the 4th degree Maclaurin polynomial for cos x.

Solution. We have a = 0 and we need to find the first 4 derivatives of cos x.

\[
\begin{aligned}
f(x) &= \cos x & f(0) &= 1 \\
f'(x) &= -\sin x & f'(0) &= 0 \\
f''(x) &= -\cos x & f''(0) &= -1 \\
f'''(x) &= \sin x & f'''(0) &= 0 \\
f^{(4)}(x) &= \cos x & f^{(4)}(0) &= 1
\end{aligned}
\]
Substituting this into equation (3.4.10) gives
\[
\begin{aligned}
T_4(x) &= 1 + 1\cdot(0)\cdot x + \frac{1}{2}\cdot(-1)\cdot x^2 + \frac{1}{6}\cdot 0\cdot x^3 + \frac{1}{24}\cdot(1)\cdot x^4 \\
       &= 1 - \frac{x^2}{2} + \frac{x^4}{24}
\end{aligned}
\]

Notice that since the 4th derivative of cos x is cos x again, we also have that the fifth derivative
is the same as the first derivative, and the sixth derivative is the same as the second
derivative and so on. Hence the next four derivatives are

\[
\begin{aligned}
f^{(4)}(x) &= \cos x & f^{(4)}(0) &= 1 \\
f^{(5)}(x) &= -\sin x & f^{(5)}(0) &= 0 \\
f^{(6)}(x) &= -\cos x & f^{(6)}(0) &= -1 \\
f^{(7)}(x) &= \sin x & f^{(7)}(0) &= 0 \\
f^{(8)}(x) &= \cos x & f^{(8)}(0) &= 1
\end{aligned}
\]

Continuing this process gives us the $2n^{\text{th}}$ Maclaurin polynomial

\[ T_{2n}(x) = \sum_{k=0}^{n} \frac{(-1)^k}{(2k)!}\, x^{2k} \]

Warning 3.4.15. 


The above formula only works when x is measured in radians, because all of our 
derivative formulae for trig functions were developed under the assumption that 


angles are measured in radians. 
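A short numerical sketch shows how quickly these polynomials approach cos x near 0. The function name `maclaurin_cos` and the sample point x = 0.5 are illustrative assumptions; as the warning says, x must be in radians.

```python
import math


def maclaurin_cos(n, x):
    """T_{2n}(x), the sum of (-1)**k * x**(2k) / (2k)! for k = 0..n, with x in radians."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n + 1))


x = 0.5  # radians
for n in range(4):
    print(f"T_{2 * n}({x}) = {maclaurin_cos(n, x):.10f}")
print(f"cos({x})  = {math.cos(x):.10f}")
```

Passing an angle in degrees without converting it first (for example with `math.radians`) would silently give wrong values, which is the practical content of the warning.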


Below we plot cos x against its first few Maclaurin polynomial approximations:

[Figure: graphs of cos x together with its Maclaurin polynomial approximations; the panels include $\cos x \approx 1 - \frac{1}{2}x^2$ and $\cos x \approx 1 - \frac{1}{2}x^2 + \frac{1}{24}x^4$.]


The above work is quite easily recycled to get the Maclaurin polynomial for sine: 


Example 3.4.16 


Find the 5th degree Maclaurin polynomial for sin x.

Solution. We could simply work as before and compute the first five derivatives of sin x.
But set g(x) = sin x and notice that g(x) = −f'(x), where f(x) = cos x. Then we have

\[
\begin{aligned}
g(0) &= -f'(0) = 0 \\
g'(0) &= -f''(0) = 1 \\
g''(0) &= -f'''(0) = 0 \\
g'''(0) &= -f^{(4)}(0) = -1 \\
g^{(4)}(0) &= -f^{(5)}(0) = 0 \\
g^{(5)}(0) &= -f^{(6)}(0) = 1
\end{aligned}
\]
Hence the required Maclaurin polynomial is

\[ T_5(x) = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 \]

Just as we extended to the $2n^{\text{th}}$ Maclaurin polynomial for cosine, we can also extend our
work to compute the $(2n+1)^{\text{th}}$ Maclaurin polynomial for sine:

\[ T_{2n+1}(x) = \sum_{k=0}^{n} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1} \]


Warning 3.4.17. 


The above formula only works when x is measured in radians, because all of our 
derivative formulae for trig functions were developed under the assumption that 


angles are measured in radians. 


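Below we plot sin x against its first few Maclaurin polynomial approximations.

[Figure: graphs of sin x together with its Maclaurin polynomial approximations; the panels include $\sin x \approx x - \frac{1}{3!}x^3$, $\sin x \approx x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5$ and $\sin x \approx x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \frac{1}{7!}x^7$.]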


To get an idea of how good these Taylor polynomials are at approximating sin and cos,
let's concentrate on sin x and consider x's whose magnitude |x| ≤ 1. There are tricks that
you can employ³⁹ to evaluate sine and cosine at values of x outside this range.


39 If you are writing software to evaluate sin x, you can always use the trig identity sin(x) = sin(x − 2nπ),
to easily restrict to |x| ≤ π. You can then use the trig identity sin(x) = −sin(x ± π) to reduce to |x| ≤ π/2.
Finally you can use the trig identity sin(x) = ∓cos(π/2 ± x) to reduce to |x| ≤ π/4 < 1.

If |x| ≤ 1 radians⁴⁰, then the magnitudes of the successive terms in the Taylor polynomials
for sin x are bounded by

\[
\begin{aligned}
|x| &\le 1 & \frac{1}{3!}|x|^3 &\le \frac{1}{6} & \frac{1}{5!}|x|^5 &\le \frac{1}{120} \approx 0.0083 \\
\frac{1}{7!}|x|^7 &\le \frac{1}{7!} \approx 0.0002 & \frac{1}{9!}|x|^9 &\le \frac{1}{9!} \approx 0.000003 & \frac{1}{11!}|x|^{11} &\le \frac{1}{11!} \approx 0.000000025
\end{aligned}
\]

From these inequalities, and the graphs on the previous pages, it certainly looks like, for x 
not too large, even relatively low degree Taylor polynomials give very good approxima- 
tions. In Section 3.4.8 we'll see how to get rigorous error bounds on our Taylor polynomial 


approximations. 
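The term bounds above can be compared with the actual error at the edge of the range. The sketch below is illustrative only; the function name `maclaurin_sin` is an assumption, and x = 1 is used simply because it is the largest magnitude allowed by |x| ≤ 1.

```python
import math


def maclaurin_sin(n, x):
    """T_{2n+1}(x), the sum of (-1)**k * x**(2k+1) / (2k+1)! for k = 0..n, with x in radians."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))


for n in range(6):
    degree = 2 * n + 1
    term_bound = 1 / math.factorial(degree)  # bound on the degree-th term when |x| <= 1
    error_at_one = abs(math.sin(1.0) - maclaurin_sin(n, 1.0))
    print(f"degree {degree:2d}: term bound {term_bound:.9f}, error at x = 1: {error_at_one:.2e}")
```

This is only numerical evidence; the rigorous error bounds are the subject of the later subsection referred to above.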


3.4.6 » Estimating change and Δx, Δy notation


Suppose that we have two variables x and y that are related by y = f(x), for some func- 
tion f. One of the most important applications of calculus is to help us understand what 
happens to y when we make a small change in x. 


Notation 3.4.18. 


Let x, y be variables related by a function f. That is y = f(x). Then we denote
a small change in the variable x by Δx (read as “delta x”). The corresponding
small change in the variable y is denoted Δy (read as “delta y”).

\[ \Delta y = f(x + \Delta x) - f(x) \]

In many situations we do not need to compute Δy exactly and are instead happy with
an approximation. Consider the following example.

Example 3.4.19


Let x be the number of cars manufactured per week in some factory and let y be the cost of
manufacturing those x cars. Given that the factory currently produces a cars per week,
we would like to estimate the increase in cost if we make a small change in the number of
cars produced.


Solution. We are told that a is the number of cars currently produced per week; the cost 
of production is then f(a). 


• Say the number of cars produced is changed from a to a + Δx (where Δx is some
small number).

• As x undergoes this change, the costs change from y = f(a) to f(a + Δx). Hence

\[ \Delta y = f(a + \Delta x) - f(a) \]

• We can estimate this change using a linear approximation. Substituting x = a + Δx
into the equation (3.4.3) yields the approximation

\[ f(a + \Delta x) \approx f(a) + f'(a)(a + \Delta x - a) \]

and consequently the approximation

\[ \Delta y = f(a + \Delta x) - f(a) \approx f(a) + f'(a)\Delta x - f(a) \]

simplifies to the following neat estimate of Δy:

Equation 3.4.20 (Linear approximation of Δy).

\[ \Delta y \approx f'(a)\Delta x \]

• In the automobile manufacturing example, when the production level is a cars per
week, increasing the production level by Δx will cost approximately f'(a)Δx. The
additional cost per additional car, f'(a), is called the “marginal cost” of a car.

• If we instead use the quadratic approximation (given by equation (3.4.6)) then we
estimate

\[ f(a + \Delta x) \approx f(a) + f'(a)\Delta x + \tfrac{1}{2} f''(a)\Delta x^2 \]

and so

\[ \Delta y = f(a + \Delta x) - f(a) \approx f(a) + f'(a)\Delta x + \tfrac{1}{2} f''(a)\Delta x^2 - f(a) \]

which simplifies to

Equation 3.4.21 (Quadratic approximation of Δy).

\[ \Delta y \approx f'(a)\Delta x + \tfrac{1}{2} f''(a)\Delta x^2 \]

40 Recall that the derivative formulae that we used to derive the Taylor polynomials are valid only when
x is in radians. The restriction −1 ≤ x ≤ 1 radians translates to angles bounded by 180°/π ≈ 57°.


3.4.7 »» Further examples 


In this subsection we give further examples of computation and use of Taylor approxima- 
tions. 


Example 3.4.22

Estimate tan 46°, using the constant, linear and quadratic approximations (equations (3.4.1),
(3.4.3) and (3.4.6)).

Solution. Note that we need to be careful to translate angles measured in degrees to
radians.

• Set f(x) = tan x, $x = 46\frac{\pi}{180}$ radians and $a = 45\frac{\pi}{180} = \frac{\pi}{4}$ radians. This is a good choice
for a because
  - a = 45° is close to x = 46°. As noted above, it is generally the case that the
    closer x is to a, the better various approximations will be.
  - We know the values of all trig functions at 45°.

• Now we need to compute f and its first two derivatives at x = a. It is a good time
to recall the special $1 : 1 : \sqrt{2}$ triangle. So

\[
\begin{aligned}
f(x) &= \tan x & f(\pi/4) &= 1 \\
f'(x) &= \sec^2 x = \frac{1}{\cos^2 x} & f'(\pi/4) &= \frac{1}{(1/\sqrt{2})^2} = 2 \\
f''(x) &= \frac{2\sin x}{\cos^3 x} & f''(\pi/4) &= \frac{2/\sqrt{2}}{(1/\sqrt{2})^3} = 4
\end{aligned}
\]

• As $x - a = 46\frac{\pi}{180} - 45\frac{\pi}{180} = \frac{\pi}{180}$ radians, the three approximations are

\[
\begin{aligned}
f(x) &\approx f(a) = 1 \\
f(x) &\approx f(a) + f'(a)(x - a) = 1 + 2\,\frac{\pi}{180} = 1.034907 \\
f(x) &\approx f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2 = 1 + 2\,\frac{\pi}{180} + \tfrac{1}{2}\cdot 4\left(\frac{\pi}{180}\right)^2 = 1.035516
\end{aligned}
\]

For comparison purposes, tan 46° really is 1.035530 to 6 decimal places.
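The three estimates of tan 46° can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function name `tan_approximations` is hypothetical, and the conversion to radians is done explicitly with `math.radians`, since the derivative formulae require radians.

```python
import math


def tan_approximations(x_degrees, a_degrees=45.0):
    """Constant, linear and quadratic approximations of tan x about a (angles given in degrees)."""
    a = math.radians(a_degrees)
    dx = math.radians(x_degrees) - a           # x - a, in radians
    f = math.tan(a)
    f1 = 1.0 / math.cos(a) ** 2                # f'(a) = sec^2(a)
    f2 = 2.0 * math.sin(a) / math.cos(a) ** 3  # f''(a) = 2 sin(a) / cos^3(a)
    constant = f
    linear = f + f1 * dx
    quadratic = linear + 0.5 * f2 * dx ** 2
    return constant, linear, quadratic


constant, linear, quadratic = tan_approximations(46.0)
print(constant, linear, quadratic, math.tan(math.radians(46.0)))
```

Using dx = 1 (the difference in degrees) instead of π/180 would give wildly wrong estimates, which is exactly the mistake the example warns against.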


Warning 3.4.23. 


All of our derivative formulae for trig functions were developed under the as- 
sumption that angles are measured in radians. Those derivatives appeared in 


the approximation formulae that we used in Example 3.4.22, so we were obliged 
to express x − a in radians.

Example 3.4.24

Suppose that you are ten meters from a vertical pole. You were contracted to measure the
height of the pole. You can't take it down or climb it. So you measure the angle subtended
by the top of the pole. You measure θ = 30°, which gives

\[ h = 10\tan 30^\circ = \frac{10}{\sqrt{3}} \approx 5.77\ \text{m} \]

This is just standard trigonometry: if we know the angle exactly then we know the
height exactly.

However, in the “real world” angles are hard to measure with such precision. If the
contract requires the measurement of the pole to be accurate within 10 cm, how accurate
does your measurement of the angle θ need to be?

Solution. For simplicity⁴¹, we are going to assume that the pole is perfectly straight and
perfectly vertical and that your distance from the pole was exactly 10 m.


• Write $\theta = \theta_0 + \Delta\theta$ where θ is the exact angle, $\theta_0$ is the measured angle and Δθ is the
error.

• Similarly write $h = h_0 + \Delta h$, where h is the exact height and $h_0 = \frac{10}{\sqrt{3}}$ is the computed
height. Their difference, Δh, is the error.

• Then

\[
\begin{aligned}
h_0 &= 10\tan\theta_0 \qquad h_0 + \Delta h = 10\tan(\theta_0 + \Delta\theta) \\
\Delta h &= 10\tan(\theta_0 + \Delta\theta) - 10\tan\theta_0
\end{aligned}
\]

We could attempt to solve this equation for Δθ in terms of Δh, but it is far simpler
to approximate Δh using the linear approximation in equation 3.4.20.

• To use equation 3.4.20, replace y with h, x with θ and a with $\theta_0$. Our function is f(θ) =
10 tan θ and $\theta_0$ = 30° = π/6 radians. Then

\[ \Delta y \approx f'(a)\Delta x \qquad\text{becomes}\qquad \Delta h \approx f'(\theta_0)\Delta\theta \]

Since f(θ) = 10 tan θ, f'(θ) = 10 sec² θ and

\[ f'(\theta_0) = 10\sec^2(\pi/6) = 10\cdot\left(\frac{2}{\sqrt{3}}\right)^2 = \frac{40}{3} \]

• Putting things together gives

\[ \Delta h \approx f'(\theta_0)\Delta\theta \qquad\text{becomes}\qquad \Delta h \approx \frac{40}{3}\Delta\theta \]

We can then solve this equation for Δθ in terms of Δh:

\[ \Delta\theta \approx \frac{3}{40}\Delta h \]

• We are told that we must have |Δh| < 0.1, so we must have

\[ |\Delta\theta| \le \frac{3}{40}\cdot 0.1 = \frac{3}{400} \]

This is measured in radians, so converting back to degrees

\[ \frac{3}{400}\cdot\frac{180}{\pi} \approx 0.43^\circ \]

41 Mathematicians love assumptions that let us tame the real world.
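The tolerance found in this example can be checked against the exact trigonometry. The sketch below is illustrative; the function name `angle_tolerance_degrees` and its parameters are assumptions, and it simply rearranges the linear estimate Δh ≈ f'(θ₀)Δθ with f(θ) = (distance) · tan θ.

```python
import math


def angle_tolerance_degrees(distance_m, angle_degrees, height_tolerance_m):
    """Largest |delta theta| in degrees allowed by the linear estimate of the height error."""
    theta0 = math.radians(angle_degrees)
    fprime = distance_m / math.cos(theta0) ** 2  # derivative of distance * tan(theta)
    return math.degrees(height_tolerance_m / fprime)


tolerance = angle_tolerance_degrees(10.0, 30.0, 0.1)

# Exact change in computed height if the angle is off by the full tolerance.
height_error = (10.0 * math.tan(math.radians(30.0 + tolerance))
                - 10.0 * math.tan(math.radians(30.0)))
print(tolerance, height_error)
```

The exact height error comes out slightly above 0.1 m because the linear approximation ignores the curvature of tan θ, so in practice one would aim for a somewhat tighter angle measurement.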


Suppose that you measure, approximately, some quantity. Suppose that the exact
value of that quantity is $Q_0$ and that your measurement yielded $Q_0 + \Delta Q$. Then
|ΔQ| is called the absolute error of the measurement and $100\frac{|\Delta Q|}{Q_0}$ is called the
percentage error of the measurement. As an example, if the exact value is 4 and
the measured value is 5, then the absolute error is |5 − 4| = 1 and the percentage
error is $100\frac{|5 - 4|}{4} = 25$. That is, the error, 1, was 25% of the exact value, 4.


Example 3.4.26 


Suppose that the radius of a sphere has been measured with a percentage error of at most
ε%. Find the corresponding approximate percentage errors in the surface area and volume
of the sphere.

Solution. We need to be careful in this problem to convert between absolute and percentage
errors correctly.

• Suppose that the exact radius is $r_0$ and that the measured radius is $r_0 + \Delta r$.

• Then the absolute error in the measurement is |Δr| and, by definition, the percentage
error is $100\frac{|\Delta r|}{r_0}$. We are told that $100\frac{|\Delta r|}{r_0} \le \varepsilon$.

• The surface area⁴² of a sphere of radius r is $A(r) = 4\pi r^2$. The error in the surface
area computed with the measured radius is

\[ \Delta A = A(r_0 + \Delta r) - A(r_0) \approx A'(r_0)\Delta r = 8\pi r_0 \Delta r \]

where we have made use of the linear approximation, equation (3.4.20).

• The corresponding percentage error is then

\[
100\frac{|\Delta A|}{A(r_0)} \approx 100\frac{|A'(r_0)\Delta r|}{A(r_0)} = 100\frac{8\pi r_0 |\Delta r|}{4\pi r_0^2} = 2\times 100\frac{|\Delta r|}{r_0} \le 2\varepsilon
\]

42 We do not expect you to remember the surface areas of solids for this course. d 
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e The volume of a sphere* of radius r is V(r) = arr’. The error in the volume com- 
puted with the measured radius is 


AV = V(ro + Ar) — V(r9) = V'(r0) Ar 
= Arcr3Ar 
where we have again made use of the linear approximation, equation (3.4.20). 


• The corresponding percentage error is

\[ 100\frac{|\Delta V|}{V(r_0)} \approx 100\frac{|V'(r_0)\,\Delta r|}{V(r_0)} = 100\frac{4\pi r_0^2|\Delta r|}{4\pi r_0^3/3} = 3\times 100\frac{|\Delta r|}{r_0} \le 3\varepsilon \]


We have just computed an approximation to $\Delta V$. This problem is actually sufficiently
simple that we can compute $\Delta V$ exactly:

\[ \Delta V = V(r_0 + \Delta r) - V(r_0) = \tfrac{4}{3}\pi (r_0 + \Delta r)^3 - \tfrac{4}{3}\pi r_0^3 \]


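• Applying $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$ with $a = r_0$ and $b = \Delta r$ gives

\begin{align*}
V(r_0 + \Delta r) - V(r_0) &= \tfrac{4}{3}\pi\left[r_0^3 + 3r_0^2\,\Delta r + 3r_0(\Delta r)^2 + (\Delta r)^3\right] - \tfrac{4}{3}\pi r_0^3 \\
&= \tfrac{4}{3}\pi\left[3r_0^2\,\Delta r + 3r_0(\Delta r)^2 + (\Delta r)^3\right]
\end{align*}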


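• Thus the difference between the exact error and the linear approximation to the error
is obtained by retaining only the last two terms in the square brackets. This has
magnitude

\[ \tfrac{4}{3}\pi\left|3r_0(\Delta r)^2 + (\Delta r)^3\right| = \tfrac{4}{3}\pi\,|3r_0 + \Delta r|\,(\Delta r)^2 \]

or in percentage terms

\begin{align*}
100\cdot\frac{1}{\frac{4}{3}\pi r_0^3}\cdot\tfrac{4}{3}\pi\left|3r_0(\Delta r)^2 + (\Delta r)^3\right|
&= 100\left|3\frac{(\Delta r)^2}{r_0^2} + \frac{(\Delta r)^3}{r_0^3}\right| \\
&= \left(100\frac{3|\Delta r|}{r_0}\right)\cdot\frac{|\Delta r|}{r_0}\cdot\left|1 + \frac{\Delta r}{3r_0}\right| \\
&\le 3\varepsilon\left(\frac{\varepsilon}{100}\right)\left(1 + \frac{\varepsilon}{300}\right)
\end{align*}

Since $\varepsilon$ is small, we can assume that $1 + \frac{\varepsilon}{300} \approx 1$. Hence the difference between the
exact error and the linear approximation of the error is roughly a factor of $\frac{\varepsilon}{100}$ smaller
than the linear approximation $3\varepsilon$.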


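• As an aside, notice that if we argue that $\Delta r$ is very small and so we can ignore terms
involving $(\Delta r)^2$ and $(\Delta r)^3$ as being really really small, then we obtain

\begin{align*}
V(r_0 + \Delta r) - V(r_0) &= \tfrac{4}{3}\pi\Big[3r_0^2\,\Delta r + \underbrace{3r_0(\Delta r)^2 + (\Delta r)^3}_{\text{really really small}}\Big] \\
&\approx \tfrac{4}{3}\pi\cdot 3r_0^2\,\Delta r = 4\pi r_0^2\,\Delta r
\end{align*}

which is precisely the result of our linear approximation above.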
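
The factors of 2 and 3 found in this example can be checked numerically. The sketch below is only an illustration: the radius `r0 = 2.0` and the value `eps = 1.0` are hypothetical choices, and the helper names are not from the text.

```python
import math

def percent_error(exact, measured):
    """Percentage error 100*|measured - exact|/exact, as defined above."""
    return 100 * abs(measured - exact) / exact

def sphere_area(r):
    return 4 * math.pi * r**2

def sphere_volume(r):
    return 4 / 3 * math.pi * r**3

r0 = 2.0                      # hypothetical exact radius
eps = 1.0                     # hypothetical percentage error in the radius
r = r0 * (1 + eps / 100)      # measured radius, off by exactly eps percent

area_err = percent_error(sphere_area(r0), sphere_area(r))
vol_err = percent_error(sphere_volume(r0), sphere_volume(r))

# Gap between the exact volume error and the linear estimate 3*eps,
# compared with the formula 3*eps*(eps/100)*(1 + eps/300) derived above.
gap = vol_err - 3 * eps
predicted_gap = 3 * eps * (eps / 100) * (1 + eps / 300)
print(area_err, vol_err, gap, predicted_gap)
```

With these inputs the area error comes out close to 2% and the volume error close to 3%, and the small gap in the volume error agrees with the formula for the difference between the exact and linearised errors.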


43 We do expect you to remember the formula for the volume of a sphere.

Example 3.4.27


To compute the height of a lamp post, the length s of the shadow of a two meter pole is
measured. The pole is 6 m from the lamp post. If the length of the shadow was measured
to be 4 m, with an error of at most ten cm, find the height of the lamp post and estimate
the percentage error in the height.


Solution. We should first draw a picture⁴⁴


• By similar triangles we see that

\[ \frac{2}{s} = \frac{h}{6+s} \]

from which we can isolate $h$ as a function of $s$:

\[ h = \frac{2(6+s)}{s} = \frac{12}{s} + 2 \]

• The length of the shadow was measured to be $s_0 = 4$ m. The corresponding height
of the lamp post is

\[ h_0 = \frac{12}{4} + 2 = 5\text{ m} \]

• If the error in the measurement of the length of the shadow was $\Delta s$, then the exact
shadow length was $s = s_0 + \Delta s$ and the exact lamp post height is $h = f(s_0 + \Delta s)$,
where $f(s) = \frac{12}{s} + 2$. The error in the computed lamp post height is

\[ \Delta h = h - h_0 = f(s_0 + \Delta s) - f(s_0) \]


44 We get to reuse that nice lamp post picture from Example 3.2.4.




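• We can then make a linear approximation of this error using equation (3.4.20):

\[ \Delta h \approx f'(s_0)\,\Delta s = -\frac{12}{s_0^2}\,\Delta s = -\frac{12}{4^2}\,\Delta s \]

• We are told that $|\Delta s| \le \frac{1}{10}$ m. Consequently, approximately,

\[ |\Delta h| \le \frac{12}{4^2}\cdot\frac{1}{10} = \frac{3}{40} \]

The percentage error is then approximately

\[ 100\frac{|\Delta h|}{h_0} \le 100\frac{3}{40\times 5} = 1.5\% \]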
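
As a rough check of this estimate, the following sketch compares the linearised bound with the exact change in height at the two extreme shadow lengths. The function name `lamp_height` is a hypothetical helper, and `ds = 0.1` is the bound on $|\Delta s|$ used in the computation above.

```python
def lamp_height(s):
    """Height h = 12/s + 2 from the similar-triangles relation 2/s = h/(6+s)."""
    return 12 / s + 2

s0 = 4.0      # measured shadow length in metres
ds = 0.1      # bound on |delta s| used in the computation above
h0 = lamp_height(s0)

linear_bound = 12 / s0**2 * ds     # |f'(s0)| * |delta s|
worst_exact = max(abs(lamp_height(s0 + d) - h0) for d in (-ds, ds))

print(h0, 100 * linear_bound / h0, 100 * worst_exact / h0)
```

The exact worst case is slightly larger than the linear estimate of 1.5%, which is what one would expect from an approximation rather than a guaranteed bound.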




3.4.8 »» The error in the Taylor polynomial approximations 


Any time you make an approximation, it is desirable to have some idea of the size of the 
error you introduced. That is, we would like to know the difference R(x) between the 
original function f(x) and our approximation F(x): 


\[ R(x) = f(x) - F(x). \]
Of course if we know R(x) exactly, then we could recover f(x) = F(x) + R(x) — so this 
is an unrealistic hope. In practice we would simply like to bound R(x): 


\[ |R(x)| = |f(x) - F(x)| \le M \]


where (hopefully) M is some small number. It is worth stressing that we do not need the
tightest possible value of M, we just need a relatively easily computed M that isn’t too far
off the true value of $|f(x) - F(x)|$.

We will now develop a formula for the error introduced by the constant approximation,
equation (3.4.1) (developed back in Section 3.4.1)

\[ f(x) \approx f(a) = T_0(x) \qquad \text{0th Taylor polynomial} \]

The resulting formula can be used to get an upper bound on the size of the error $|R(x)|$.
The main ingredient we will need is the Mean-Value Theorem (Theorem 2.13.4), so
we suggest you quickly revise it. Consider the following obvious statement:

\begin{align*}
f(x) &= f(x) && \text{now some sneaky manipulations} \\
&= f(a) + \big(f(x) - f(a)\big) \\
&= \underbrace{f(a)}_{=T_0(x)} + \big(f(x) - f(a)\big)\cdot\underbrace{\frac{x-a}{x-a}}_{=1} \\
&= T_0(x) + \underbrace{\frac{f(x) - f(a)}{x-a}}_{\text{looks familiar}}\cdot(x-a)
\end{align*}


Indeed, this equation is important in the discussion that follows, so we’ll highlight it

Equation 3.4.28 (We will need it again soon).

\[ f(x) = T_0(x) + \left[\frac{f(x) - f(a)}{x-a}\right](x-a) \]


The coefficient $\frac{f(x)-f(a)}{x-a}$ of $(x-a)$ is the average slope of $f(t)$ as $t$ moves from $t = a$
to $t = x$. We can picture this as the slope of the secant joining the points $(a, f(a))$ and
$(x, f(x))$ in the sketch below.

As $t$ moves from $a$ to $x$, the instantaneous slope $f'(t)$ keeps changing. Sometimes $f'(t)$ might
be larger than the average slope $\frac{f(x)-f(a)}{x-a}$, and sometimes $f'(t)$ might be smaller than the
average slope $\frac{f(x)-f(a)}{x-a}$. However, by the Mean-Value Theorem (Theorem 2.13.4), there
must be some number $c$, strictly between $a$ and $x$, for which $f'(c) = \frac{f(x)-f(a)}{x-a}$ exactly.


Substituting this into formula (3.4.28) gives 


Equation 3.4.29 (Towards the error). 


\[ f(x) = T_0(x) + f'(c)(x-a) \qquad \text{for some $c$ strictly between $a$ and $x$} \]


Notice that this expression as it stands is not quite what we want. Let us massage this 
around a little more into a more useful form 


Equation 3.4.30 (The error in constant approximation).

\[ f(x) - T_0(x) = f'(c)\cdot(x-a) \qquad \text{for some $c$ strictly between $a$ and $x$} \]


Notice that the MVT doesn’t tell us the value of c; however, we do know that it lies
strictly between x and a. So if we can get a good bound on f’(c) on this interval then we
can get a good bound on the error.


Example 3.4.31 


Let us return to Example 3.4.2, and we’ll try to bound the error in our approximation
of $e^{0.1}$.

• Recall that $f(x) = e^x$, $a = 0$ and $T_0(x) = e^0 = 1$.
• Then by equation (3.4.30)

\[ e^{0.1} - T_0(0.1) = f'(c)\cdot(0.1 - 0) \qquad \text{with } 0 < c < 0.1 \]


• Now $f'(c) = e^c$, so we need to bound $e^c$ on $(0, 0.1)$. Since $e^c$ is an increasing function,
we know that

\[ e^0 < f'(c) < e^{0.1} \qquad \text{when } 0 < c < 0.1 \]

So one is tempted to write that

\[ |e^{0.1} - T_0(0.1)| = |R(0.1)| = |f'(c)|\cdot(0.1 - 0) < e^{0.1}\cdot 0.1 \]


And while this is true, it is rather circular. We have just bounded the error in our
approximation of $e^{0.1}$ by $\frac{1}{10}e^{0.1}$; if we actually knew $e^{0.1}$ then we wouldn’t need to
estimate it!


• While we don’t know $e^{0.1}$ exactly, we do know⁴⁵ that $1 = e^0 < e^{0.1} < e^1 < 3$. This
gives us

\[ |R(0.1)| < 3\times 0.1 = 0.3 \]

That is, the error in our approximation of $e^{0.1}$ is no greater than 0.3. Recall that we
don’t need the error exactly, we just need a good idea of how large it actually is.


• In fact the real error here is

\[ |e^{0.1} - T_0(0.1)| = |e^{0.1} - 1| = 0.1051709\ldots \]

so we have over-estimated the error by a factor of roughly 3.


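But we can actually go a little further here: we can bound the error above and below.
If we do not take absolute values, then since

\[ e^{0.1} - T_0(0.1) = f'(c)\cdot 0.1 \qquad \text{and} \qquad 1 < f'(c) < 3 \]

we can write

\[ 1\times 0.1 \le e^{0.1} - T_0(0.1) \le 3\times 0.1 \]

so

\begin{align*}
T_0(0.1) + 0.1 &\le e^{0.1} \le T_0(0.1) + 0.3 \\
1.1 &\le e^{0.1} \le 1.3
\end{align*}

So while the upper bound is weak, the lower bound is quite tight.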
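
The two-sided bound can be compared with the value a computer gives for $e^{0.1}$. This is only a sanity check and not part of the argument, since it relies on `math.exp`, which is exactly the quantity being estimated.

```python
import math

x = 0.1
T0 = 1.0                          # constant approximation T_0(x) = e^0
actual_error = math.exp(x) - T0   # uses the library value of e^0.1

# Bounds from 1 < f'(c) < 3 for 0 < c < 0.1 (this uses e < 3)
lower, upper = 1 * x, 3 * x
print(lower, actual_error, upper)
```

The printed error sits just above the lower bound 0.1 and well below the upper bound 0.3, matching the remark that the lower bound is the tight one.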
45 Oops! Do we really know that e < 3? We haven’t proved it. We will do so soon.


There are formulae similar to equation (3.4.29), that can be used to bound the error in 
our other approximations; all are based on generalisations of the MVT. The next one — 
for linear approximations — is 


\[ f(x) = \underbrace{f(a) + f'(a)(x-a)}_{=T_1(x)} + \tfrac{1}{2}f''(c)(x-a)^2 \qquad \text{for some $c$ strictly between $a$ and $x$} \]


which we can rewrite in terms of T;(x): 


Equation 3.4.32 (The error in linear approximation). 


\[ f(x) - T_1(x) = \tfrac{1}{2}f''(c)(x-a)^2 \qquad \text{for some $c$ strictly between $a$ and $x$} \]


It implies that the error that we make when we approximate $f(x)$ by $T_1(x) = f(a) + f'(a)(x-a)$
is exactly $\frac{1}{2}f''(c)(x-a)^2$ for some $c$ strictly between $a$ and $x$.


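More generally

\[ f(x) = \underbrace{f(a) + f'(a)(x-a) + \cdots + \tfrac{1}{n!}f^{(n)}(a)(x-a)^n}_{=T_n(x)} + \tfrac{1}{(n+1)!}f^{(n+1)}(c)(x-a)^{n+1} \]

for some $c$ strictly between $a$ and $x$. Rewriting this in terms of $T_n(x)$ gives the error formula
referred to as equation (3.4.33):

\[ f(x) - T_n(x) = \tfrac{1}{(n+1)!}f^{(n+1)}(c)(x-a)^{n+1} \qquad \text{for some $c$ strictly between $a$ and $x$} \]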


That is, the error introduced when f(x) is approximated by its Taylor polynomial of 
degree n, is precisely the last term of the Taylor polynomial of degree n + 1, but with the 
derivative evaluated at some point between a and x, rather than exactly at a. These error 
formulae are proven in the optional Section 3.4.9 later in this chapter. 


Example 3.4.34


Approximate sin 46° using Taylor polynomials about a = 45°, and estimate the resulting 
error. 


Solution. 
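• Start by defining $f(x) = \sin x$ and

\[ a = 45^\circ = 45\tfrac{\pi}{180}\text{ radians} \qquad x = 46^\circ = 46\tfrac{\pi}{180}\text{ radians} \qquad x - a = \tfrac{\pi}{180}\text{ radians} \]

• The first few derivatives of $f$ at $a$ are

\begin{align*}
f(x) &= \sin x & f(a) &= \tfrac{1}{\sqrt{2}} \\
f'(x) &= \cos x & f'(a) &= \tfrac{1}{\sqrt{2}} \\
f''(x) &= -\sin x & f''(a) &= -\tfrac{1}{\sqrt{2}} \\
f^{(3)}(x) &= -\cos x & f^{(3)}(a) &= -\tfrac{1}{\sqrt{2}}
\end{align*}

• The constant, linear and quadratic Taylor approximations for $\sin(x)$ about $\frac{\pi}{4}$ are

\begin{align*}
T_0(x) &= f(a) = \tfrac{1}{\sqrt{2}} \\
T_1(x) &= T_0(x) + f'(a)(x-a) = \tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}}\left(x - \tfrac{\pi}{4}\right) \\
T_2(x) &= T_1(x) + \tfrac{1}{2}f''(a)(x-a)^2 = \tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}}\left(x - \tfrac{\pi}{4}\right) - \tfrac{1}{2\sqrt{2}}\left(x - \tfrac{\pi}{4}\right)^2
\end{align*}

• So the approximations for $\sin 46^\circ$ are

\begin{align*}
\sin 46^\circ &\approx T_0\left(\tfrac{46\pi}{180}\right) = \tfrac{1}{\sqrt{2}} = 0.70710678 \\
\sin 46^\circ &\approx T_1\left(\tfrac{46\pi}{180}\right) = \tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}}\left(\tfrac{\pi}{180}\right) = 0.71944812 \\
\sin 46^\circ &\approx T_2\left(\tfrac{46\pi}{180}\right) = \tfrac{1}{\sqrt{2}} + \tfrac{1}{\sqrt{2}}\left(\tfrac{\pi}{180}\right) - \tfrac{1}{2\sqrt{2}}\left(\tfrac{\pi}{180}\right)^2 = 0.71934042
\end{align*}

• The errors in those approximations are (respectively)

\begin{align*}
\text{error in } 0.70710678 &= f'(c)(x-a) = \cos c\cdot\left(\tfrac{\pi}{180}\right) \\
\text{error in } 0.71944812 &= \tfrac{1}{2}f''(c)(x-a)^2 = -\tfrac{1}{2}\sin c\cdot\left(\tfrac{\pi}{180}\right)^2 \\
\text{error in } 0.71934042 &= \tfrac{1}{3!}f^{(3)}(c)(x-a)^3 = -\tfrac{1}{3!}\cos c\cdot\left(\tfrac{\pi}{180}\right)^3
\end{align*}

In each of these three cases $c$ must lie somewhere between 45° and 46°.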


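• Rather than carefully estimating $\sin c$ and $\cos c$ for $c$ in that range, we make use of a
cruder (but much easier) bound. No matter what $c$ is, we know that $|\sin c| \le 1$ and
$|\cos c| \le 1$. Hence

\begin{align*}
|\text{error in } 0.70710678| &\le \left(\tfrac{\pi}{180}\right) < 0.018 \\
|\text{error in } 0.71944812| &\le \tfrac{1}{2}\left(\tfrac{\pi}{180}\right)^2 < 0.00016 \\
|\text{error in } 0.71934042| &\le \tfrac{1}{3!}\left(\tfrac{\pi}{180}\right)^3 < 0.0000009
\end{align*}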
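
The three approximations and their error bounds can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; it uses the library value of `math.sin` only for comparison, and the variable names are illustrative.

```python
import math

a = math.radians(45)
x = math.radians(46)
h = x - a                          # equals pi/180
s = 1 / math.sqrt(2)               # sin(45 degrees) = cos(45 degrees)

T0 = s
T1 = T0 + s * h
T2 = T1 - s * h**2 / 2
bounds = [h, h**2 / 2, h**3 / 6]   # from |sin c| <= 1 and |cos c| <= 1

for approx, bound in zip((T0, T1, T2), bounds):
    actual = abs(math.sin(x) - approx)
    print(f"{approx:.8f}  actual error {actual:.2e}  bound {bound:.2e}")
```

In each case the actual error is below the corresponding bound, and each extra term of the Taylor polynomial shrinks the error by roughly two orders of magnitude.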


Example 3.4.35 (Showing e < 3) 


In Example 3.4.31 above we used the fact that e < 3 without actually proving it. Let’s do
so now.


• Consider the linear approximation of $e^x$ about $a = 0$.

\[ T_1(x) = f(0) + f'(0)\cdot x = 1 + x \]

So at $x = 1$ we have

\[ e \approx T_1(1) = 2 \]

• The error in this approximation is

\[ e^x - T_1(x) = \tfrac{1}{2}f''(c)\,x^2 = \frac{e^c}{2}x^2 \]

So at $x = 1$ we have

\[ e - T_1(1) = \frac{e^c}{2} \]

where $0 < c < 1$.


• Now since $e^x$ is an increasing⁴⁶ function, it follows that $e^c < e$. Hence

\[ e - T_1(1) = \frac{e^c}{2} < \frac{e}{2} \]

Moving the $\frac{e}{2}$ to the left hand side and the $T_1(1)$ to the right hand side gives

\[ \frac{e}{2} \le T_1(1) = 2 \]

So $e \le 4$.


• This isn’t as tight as we would like, so now do the same with the quadratic
approximation with $a = 0$:

\[ e^x \approx T_2(x) = 1 + x + \frac{x^2}{2} \]

So when $x = 1$ we have

\[ e \approx T_2(1) = 1 + 1 + \frac{1}{2} = \frac{5}{2} \]


46 Since the derivative of $e^x$ is $e^x$ which is positive everywhere, the function is increasing everywhere.

• The error in this approximation is

\[ e^x - T_2(x) = \tfrac{1}{3!}f'''(c)\,x^3 = \frac{e^c}{6}x^3 \]

So at $x = 1$ we have

\[ e - T_2(1) = \frac{e^c}{6} \]

where $0 < c < 1$.


• Again since $e^x$ is an increasing function we have $e^c < e$. Hence

\[ e - T_2(1) = \frac{e^c}{6} < \frac{e}{6} \]

That is

\[ \frac{5e}{6} < T_2(1) = \frac{5}{2} \]

So $e < 3$ as required.


Example 3.4.36 (More on e*) 


We wrote down the general $n$th degree Maclaurin polynomial approximation of $e^x$ in
Example 3.4.12 above.

• Recall that

\[ T_n(x) = \sum_{k=0}^{n} \frac{1}{k!}x^k \]

• The error in this approximation is (by equation (3.4.33))

\[ e^x - T_n(x) = \frac{1}{(n+1)!}e^c\,x^{n+1} \]

where $c$ is some number between 0 and $x$.
• So setting $x = 1$ in this gives

\[ e - T_n(1) = \frac{1}{(n+1)!}e^c \]

where $0 < c < 1$.


• Since $e^x$ is an increasing function we know that $1 = e^0 < e^c < e^1 < 3$, so the above
expression becomes

\[ \frac{1}{(n+1)!} \le e - T_n(1) = \frac{1}{(n+1)!}e^c \le \frac{3}{(n+1)!} \]

• So when $n = 9$ we have

\[ \frac{1}{10!} \le e - \left(1 + 1 + \frac{1}{2} + \cdots + \frac{1}{9!}\right) \le \frac{3}{10!} \]

• Now $1/10! < 3/10! < 10^{-6}$, so the approximation of $e$ by

\[ e \approx 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{9!} = \frac{98641}{36288} = 2.718281\ldots \]

is correct to 6 decimal places.


• More generally we know that using $T_n(1)$ to approximate $e$ will have an error of at
most $\frac{3}{(n+1)!}$, so it converges very quickly.
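
The fraction quoted for $T_9(1)$ and the two-sided error bound can be confirmed with exact rational arithmetic. The sketch below assumes nothing beyond the Python standard library; `maclaurin_e` is a hypothetical helper name.

```python
from fractions import Fraction
from math import e, factorial

def maclaurin_e(n):
    """T_n(1) = sum_{k=0}^{n} 1/k!, computed exactly as a fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

T9 = maclaurin_e(9)
error = e - float(T9)     # compares against the library constant math.e
print(T9, float(T9), error)
```

The error lands between $1/10!$ and $3/10!$, as the bound above predicts.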




Example 3.4.37 (Example 3.4.24 Revisited) 


Recall⁴⁷ that in Example 3.4.24 (measuring the height of the pole), we used the linear
approximation

\[ f(\theta_0 + \Delta\theta) \approx f(\theta_0) + f'(\theta_0)\,\Delta\theta \]

with $f(\theta) = 10\tan\theta$ and $\theta_0 = 30\frac{\pi}{180}$ to get

\[ \Delta h = f(\theta_0 + \Delta\theta) - f(\theta_0) \approx f'(\theta_0)\,\Delta\theta \qquad \text{which implies that} \qquad \Delta\theta \approx \frac{\Delta h}{f'(\theta_0)} \]

• While this procedure is fairly reliable, it did involve an approximation, so you
could not 100% guarantee to your client’s lawyer that an accuracy of 10 cm was
achieved.


• On the other hand, if we use the exact formula (3.4.29), with the replacements $x \to \theta_0 + \Delta\theta$
and $a \to \theta_0$,

\[ f(\theta_0 + \Delta\theta) = f(\theta_0) + f'(c)\,\Delta\theta \qquad \text{for some $c$ between $\theta_0$ and $\theta_0 + \Delta\theta$} \]

in place of the approximate formula (3.4.3), this legality is taken care of:

\[ \Delta h = f(\theta_0 + \Delta\theta) - f(\theta_0) = f'(c)\,\Delta\theta \qquad \text{for some $c$ between $\theta_0$ and $\theta_0 + \Delta\theta$} \]

We can clean this up a little more since in our example $f'(\theta) = 10\sec^2\theta$. Thus for
some $c$ between $\theta_0$ and $\theta_0 + \Delta\theta$:

\[ |\Delta h| = 10\sec^2(c)\,|\Delta\theta| \]

47 Now is a good time to go back and re-read it.

• Of course we do not know exactly what $c$ is. But suppose that we know that the angle
was somewhere between 25° and 35°. In other words suppose that, even though we
don’t know precisely what our measurement error was, it was certainly no more
than 5°.


• Now on the range $25^\circ \le c \le 35^\circ$, $\sec(c)$ is an increasing and positive function. Hence
on this range

\[ 1.217\cdots = \sec^2 25^\circ \le \sec^2 c \le \sec^2 35^\circ = 1.490\cdots < 1.491 \]

So

\[ 12.17\cdot|\Delta\theta| \le |\Delta h| = 10\sec^2(c)\cdot|\Delta\theta| \le 14.91\cdot|\Delta\theta| \]

• Since we require $|\Delta h| < 0.1$, we need $14.91\,|\Delta\theta| < 0.1$, that is

\[ |\Delta\theta| < \frac{0.1}{14.91} = 0.0067\cdots \]

So we must measure angles to within 0.0067 radians, which is

\[ \frac{180}{\pi}\cdot 0.0067 = 0.38^\circ. \]

Hence a measurement error of 0.38° or less is acceptable.
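
The numbers in this example are easy to recompute. The sketch below assumes, as in the text, that the angle is known to lie between 25° and 35°; the helper `sec2` is a hypothetical name.

```python
import math

def sec2(deg):
    """sec^2 of an angle given in degrees."""
    return 1 / math.cos(math.radians(deg)) ** 2

max_slope = 10 * sec2(35)       # largest value of f'(c) = 10 sec^2(c) on [25, 35] degrees
tol_rad = 0.1 / max_slope       # angle error that keeps |delta h| within 0.1
tol_deg = math.degrees(tol_rad)
print(sec2(25), sec2(35), tol_rad, tol_deg)
```

The tolerance comes out a little under 0.39°, slightly stricter than the 0.43° obtained from the linear approximation alone.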




3.4.9 »» (optional) — Derivation of the error formulae 


In this section we will derive the formula for the error that we gave in equation (3.4.33),
namely

\[ R_n(x) = f(x) - T_n(x) = \frac{1}{(n+1)!}f^{(n+1)}(c)\cdot(x-a)^{n+1} \]

for some $c$ strictly between $a$ and $x$, and where $T_n(x)$ is the $n$th degree Taylor polynomial
approximation of $f(x)$ about $x = a$:

\[ T_n(x) = \sum_{k=0}^{n} \frac{1}{k!}f^{(k)}(a)\,(x-a)^k \]

Recall that we have already proved a special case of this formula for the constant
approximation using the Mean-Value Theorem (Theorem 2.13.4). To prove the general case we
need the following generalisation⁴⁸ of that theorem:


48 It is not a terribly creative name for the generalisation, but it is an accurate one.

Theorem 3.4.38 (Generalised Mean-Value Theorem).


Let the functions $F(x)$ and $G(x)$ both be defined and continuous on $a \le x \le b$
and both be differentiable on $a < x < b$. Furthermore, suppose that $G'(x) \ne 0$ for
all $a < x < b$. Then, there is a number $c$ obeying $a < c < b$ such that

\[ \frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(c)}{G'(c)} \]

Notice that setting G(x) = x recovers the original Mean-Value Theorem. It turns out 
that this theorem is not too difficult to prove from the MVT using some sneaky algebraic 
manipulations: 


Proof. • First we construct a new function $h(x)$ as a linear combination of $F(x)$ and
$G(x)$ so that $h(a) = h(b) = 0$. Some experimentation yields

\[ h(x) = [F(b) - F(a)]\cdot[G(x) - G(a)] - [G(b) - G(a)]\cdot[F(x) - F(a)] \]

• Since $h(a) = h(b) = 0$, the Mean-Value theorem (actually Rolle’s theorem) tells us
that there is a number $c$ obeying $a < c < b$ such that $h'(c) = 0$:

\begin{align*}
h'(x) &= [F(b) - F(a)]\cdot G'(x) - [G(b) - G(a)]\cdot F'(x) && \text{so} \\
0 &= [F(b) - F(a)]\cdot G'(c) - [G(b) - G(a)]\cdot F'(c)
\end{align*}

Now move the $G'(c)$ terms to one side and the $F'(c)$ terms to the other:

\[ [F(b) - F(a)]\cdot G'(c) = [G(b) - G(a)]\cdot F'(c). \]

• Since we have $G'(x) \ne 0$, we know that $G'(c) \ne 0$. Further the Mean-Value theorem
ensures⁴⁹ that $G(a) \ne G(b)$. Hence we can move terms about to get

\[ \frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(c)}{G'(c)} \]

as required.
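
To see the theorem in action, one can pick concrete functions and locate the point $c$ numerically. The sketch below uses the hypothetical choices $F(x) = x^3$ and $G(x) = x^2$ on $[1, 2]$ (not taken from the text) and finds a zero of $h'(x)$ from the proof by bisection.

```python
def bisect(func, lo, hi, tol=1e-12):
    """Root of func on [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    flo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical functions chosen for illustration
F, dF = (lambda x: x**3), (lambda x: 3 * x**2)
G, dG = (lambda x: x**2), (lambda x: 2 * x)
a, b = 1.0, 2.0

# h'(x) = [F(b) - F(a)] G'(x) - [G(b) - G(a)] F'(x), as in the proof
dh = lambda x: (F(b) - F(a)) * dG(x) - (G(b) - G(a)) * dF(x)

c = bisect(dh, a, b)
print(c, dF(c) / dG(c), (F(b) - F(a)) / (G(b) - G(a)))
```

For this choice the point is $c = 14/9$, and both ratios equal $7/3$, as the theorem requires.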


Armed with the above theorem we can now move on to the proof of the Taylor remain- 
der formula. 


Proof of equation (3.4.33). We begin by proving the remainder formula for $n = 1$. That is

\[ f(x) - T_1(x) = \tfrac{1}{2}f''(c)\cdot(x-a)^2 \]

49 Otherwise if $G(a) = G(b)$ the MVT tells us that there is some point $c$ between $a$ and $b$ so that $G'(c) = 0$.

• Start by setting

\[ F(x) = f(x) - T_1(x) \qquad G(x) = (x-a)^2 \]

Notice that, since $T_1(a) = f(a)$ and $T_1'(x) = f'(a)$,

\begin{align*}
F(a) &= 0 & G(a) &= 0 \\
F'(x) &= f'(x) - f'(a) & G'(x) &= 2(x-a)
\end{align*}

• Now apply the generalised MVT with $b = x$: there exists a point $q$ between $a$ and $x$
such that

\begin{align*}
\frac{F(x) - F(a)}{G(x) - G(a)} &= \frac{F'(q)}{G'(q)} \\
\frac{F(x) - 0}{G(x) - 0} &= \frac{f'(q) - f'(a)}{2(q-a)} \\
2\cdot\frac{F(x)}{G(x)} &= \frac{f'(q) - f'(a)}{q-a}
\end{align*}


• Consider the right-hand side of the above equation and set $g(x) = f'(x)$. Then we
have the term $\frac{g(q) - g(a)}{q-a}$, which is exactly the form needed to apply the MVT. So now
apply the standard MVT to the right-hand side of the above equation: there is
some $c$ between $q$ and $a$ so that

\[ \frac{f'(q) - f'(a)}{q-a} = \frac{g(q) - g(a)}{q-a} = g'(c) = f''(c) \]

Notice that here we have assumed that $f''(x)$ exists.


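• Putting this together we have that

\begin{align*}
2\cdot\frac{F(x)}{G(x)} &= \frac{f'(q) - f'(a)}{q-a} = f''(c) \\
2\,\frac{f(x) - T_1(x)}{(x-a)^2} &= f''(c) \\
f(x) - T_1(x) &= \tfrac{1}{2!}f''(c)\cdot(x-a)^2
\end{align*}

as required.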


Oof! We have now proved the case n = 1 (and we did n = 0 earlier).

To proceed, assume we have proved our result for n = 1,2,...,k. We realise that we
haven’t done this yet, but bear with us. Using that assumption we will prove the result is
true for n = k + 1. Once we have done that, then


• we have proved the result is true for n = 1, and


• we have shown if the result is true for n = k then it is true for n = k + 1

Hence it must be true for all n ≥ 1. This style of proof is called mathematical induction.
You can think of the process as something like climbing a ladder:

• prove that you can get onto the ladder (the result is true for n = 1), and


• if I can stand on the current rung, then I can step up to the next rung (if the result is
true for n = k then it is also true for n = k + 1)


Hence I can climb as high as I like.


e Let k > 0 and assume we have proved 


\[ f(x) - T_k(x) = \frac{1}{(k+1)!}f^{(k+1)}(c)\cdot(x-a)^{k+1} \]
for some c between a and x. 
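• Now set

\[ F(x) = f(x) - T_{k+1}(x) \qquad G(x) = (x-a)^{k+2} \]

and notice that, since $T_{k+1}(a) = f(a)$,

\[ F(a) = f(a) - T_{k+1}(a) = 0 \qquad G(a) = 0 \qquad G'(x) = (k+2)(x-a)^{k+1} \]

and apply the generalised MVT with $b = x$: hence there exists a $q$ between $a$ and $x$
so that

\begin{align*}
\frac{F(x) - F(a)}{G(x) - G(a)} &= \frac{F'(q)}{G'(q)} && \text{which becomes} \\
\frac{F(x)}{(x-a)^{k+2}} &= \frac{F'(q)}{(k+2)(q-a)^{k+1}} && \text{rearrange} \\
F(x) &= \frac{(x-a)^{k+2}}{(k+2)(q-a)^{k+1}}\cdot F'(q)
\end{align*}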
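• We now examine $F'(q)$. First carefully differentiate $F(x) = f(x) - T_{k+1}(x)$:

\begin{align*}
F'(x) &= \frac{d}{dx}\Big[f(x) - \Big(f(a) + f'(a)(x-a) + \tfrac{1}{2}f''(a)(x-a)^2 + \cdots + \tfrac{1}{(k+1)!}f^{(k+1)}(a)(x-a)^{k+1}\Big)\Big] \\
&= f'(x) - \Big(f'(a) + \tfrac{2}{2}f''(a)(x-a) + \tfrac{3}{3!}f'''(a)(x-a)^2 + \cdots + \tfrac{k+1}{(k+1)!}f^{(k+1)}(a)(x-a)^{k}\Big) \\
&= f'(x) - \Big(f'(a) + f''(a)(x-a) + \tfrac{1}{2}f'''(a)(x-a)^2 + \cdots + \tfrac{1}{k!}f^{(k+1)}(a)(x-a)^{k}\Big)
\end{align*}

Now notice that if we set $f'(x) = g(x)$ then this becomes

\[ F'(x) = g(x) - \Big(g(a) + g'(a)(x-a) + \tfrac{1}{2}g''(a)(x-a)^2 + \cdots + \tfrac{1}{k!}g^{(k)}(a)(x-a)^{k}\Big) \]

So $F'(x)$ is then exactly the remainder formula but for a degree $k$ approximation
to the function $g(x) = f'(x)$.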




• Hence $F'(q)$ is the remainder when we approximate $f'(q)$ with a degree
$k$ Taylor polynomial. The remainder formula, equation (3.4.33) with $n = k$, then tells us that
there is a number $c$ between $a$ and $q$ so that

\begin{align*}
F'(q) &= g(q) - \Big(g(a) + g'(a)(q-a) + \tfrac{1}{2}g''(a)(q-a)^2 + \cdots + \tfrac{1}{k!}g^{(k)}(a)(q-a)^k\Big) \\
&= \tfrac{1}{(k+1)!}g^{(k+1)}(c)(q-a)^{k+1} = \tfrac{1}{(k+1)!}f^{(k+2)}(c)(q-a)^{k+1}
\end{align*}

Notice that here we have assumed that $f^{(k+2)}(x)$ exists.


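• Now substitute this back into our equation above

\begin{align*}
F(x) &= \frac{(x-a)^{k+2}}{(k+2)(q-a)^{k+1}}\cdot F'(q) \\
&= \frac{(x-a)^{k+2}}{(k+2)(q-a)^{k+1}}\cdot\frac{1}{(k+1)!}f^{(k+2)}(c)(q-a)^{k+1} \\
&= \frac{1}{(k+2)!}\cdot f^{(k+2)}(c)\cdot(x-a)^{k+2}
\end{align*}

as required.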


So we now know that 


e if, for some k, the remainder formula (with n = k) is true for all k times differentiable 
functions, 


e then the remainder formula is true (with n = k + 1) for all k +1 times differentiable 
functions. 


Repeatedly applying this for k = 1,2,3,4,--- (and recalling that we have shown the re- 
mainder formula is true when n = 0,1) gives equation (3.4.33) for all n = 0,1,2,.... 


3.5 « Optimisation 


One important application of differential calculus is to find the maximum (or minimum) 
value of a function. This often finds real world applications in problems such as the fol- 
lowing. 


Example 3.5.1 


A farmer has 400m of fencing materials. What is the largest rectangular paddock that can 
be enclosed? 


Solution. We will describe a general approach to these sorts of problems in Sections 3.5.2 
and 3.5.3 below, but here we can take a stab at starting the problem. 
• Begin by defining variables and their units (more generally we might draw a picture
too); let the dimensions of the paddock be x by y metres.


• The area enclosed is then A m² where

\[ A = x\cdot y \]


At this stage we cannot apply the calculus we have developed since the area is a 
function of two variables and we only know how to work with functions of a single 
variable. We need to eliminate one variable. 


• We know that the perimeter of the rectangle (and hence the dimensions x and y) is
constrained by the amount of fencing materials the farmer has to hand:

\[ 2x + 2y \le 400 \]

and so we have

\[ y \le 200 - x \]

Clearly the area of the paddock is maximised when we use all the fencing possible,
so

\[ y = 200 - x \]


• Now substitute this back into our expression for the area

\[ A = x\cdot(200 - x) \]

Since the area cannot be negative (and our lengths x, y cannot be negative either),
we must also have

\[ 0 \le x \le 200 \]

• Thus the question of the largest paddock enclosed becomes the problem of finding
the maximum value of

\[ A = x\cdot(200 - x) \qquad \text{subject to the constraint } 0 \le x \le 200. \]
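
Before developing the calculus, one can get a feel for the answer by simply sampling the area function. The sketch below is a brute-force look at $A = x(200 - x)$ and not the systematic method developed in the rest of this section; the 0.5 m step size is an arbitrary choice.

```python
def area(x):
    """Area x*(200 - x) of a paddock with one side x metres and perimeter 400 m."""
    return x * (200 - x)

# Sample the interval 0 <= x <= 200 in steps of 0.5 m (arbitrary step size)
xs = [i / 2 for i in range(401)]
best_x = max(xs, key=area)
print(best_x, area(best_x))
```

The sampled maximum occurs at x = 100, which suggests a square paddock of area 10000 m². Sampling can only suggest this; the methods below are what justify it.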


— Example 3.5.1 — 


The above example is sufficiently simple that we can likely determine the answer by
several different methods. In general we will need more systematic methods for solving
problems of the form


Find the maximum value of y = f(x) subject to a ≤ x ≤ b


To do this we need to examine what a function looks like near its maximum and minimum 
values. 
3.5.1 »» Local and global maxima and minima
We start by asking: 


Suppose that the maximum (or minimum) value of f(x) is f(c) then what does 
that tell us about c? 


Notice that we have not yet made the ideas of maximum and minimum very precise. 
For the moment think of maximum as “the biggest value” and minimum as “the smallest 
value”. 


Warning 3.5.2. 


It is important to distinguish between “the smallest value” and “the smallest
magnitude”. For example, because

\[ -5 < -1 \]

the number −5 is smaller than −1. But the magnitude of −1, which is |−1| = 1,
is smaller than the magnitude of −5, which is |−5| = 5. Thus the smallest
number in the set {−1, −5} is −5, while the number in the set {−1, −5} that has
the smallest magnitude is −1.


Now back to thinking about what happens around a maximum. Suppose that the max- 
imum value of f(x) is f(c), then for all “nearby” points, the function should be smaller. 


Consider the derivative $f'(c)$:

\[ f'(c) = \lim_{h\to 0}\frac{f(c+h) - f(c)}{h} \]


Split the above limit into the left and right limits: 
• Consider points to the right of $x = c$, then for all $h > 0$,

\begin{align*}
f(c+h) &\le f(c) && \text{which implies that} \\
f(c+h) - f(c) &\le 0 && \text{which also implies} \\
\frac{f(c+h) - f(c)}{h} &\le 0 && \text{since } \tfrac{\text{negative}}{\text{positive}} = \text{negative}.
\end{align*}

But now if we squeeze $h \to 0$ we get

\[ \lim_{h\to 0^+}\frac{f(c+h) - f(c)}{h} \le 0 \]

(provided the limit exists).


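• Consider points to the left of $x = c$, then for all $h < 0$,

\begin{align*}
f(c+h) &\le f(c) && \text{which implies that} \\
f(c+h) - f(c) &\le 0 && \text{which also implies} \\
\frac{f(c+h) - f(c)}{h} &\ge 0 && \text{since } \tfrac{\text{negative}}{\text{negative}} = \text{positive}.
\end{align*}

But now if we squeeze $h \to 0$ we get

\[ \lim_{h\to 0^-}\frac{f(c+h) - f(c)}{h} \ge 0 \]

(provided the limit exists).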


e So if the derivative f’(c) exists, then the above right- and left-hand limits must agree, 
which forces f’(c) = 0. 


Thus we can conclude that 
If the maximum value of f(x) is f(c) and f’(c) exists, then f’(c) = 0. 
Using similar reasoning one can also see that 
If the minimum value of f(x) is f(c) and f’(c) exists, then f’(c) = 0. 
Notice two things about the above reasoning: 


• Firstly, in order for the argument to work we only need that f(x) ≤ f(c) for x close
to c; it does not matter what happens for x values far from c.


e Secondly, in the above argument we needed to consider f(x) for x both to the left of 
and to the right of c. If the function f(x) is defined on a closed interval [a,b], then the 
above argument only applies when a < c < b —not when c is either of the endpoints 
a and b. 


Consider the function below

[Figure: graph of a function of x with its maxima marked.]

This function has only 1 maximum value (the middle green point in the graph) and 1
minimum value (the rightmost blue point), however it has 4 points at which the derivative 
is zero. In the small intervals around those points where the derivative is zero, we can see 
that function is locally a maximum or minimum, even if it is not the global maximum or 
minimum. We clearly need to be more careful distinguishing between these cases. 


Let $a \le b$ and let the function $f(x)$ be defined for all $x \in [a,b]$. Now let $a \le c \le b$,
then


• We say that $f(x)$ has a global (or absolute) minimum at $x = c$ if $f(x) \ge f(c)$
for all $a \le x \le b$.


• Similarly, we say that $f(x)$ has a global (or absolute) maximum at $x = c$ if
$f(x) \le f(c)$ for all $a \le x \le b$.


Now let $a < c < b$ (note the strict inequalities), then


• We say that $f(x)$ has a local minimum at $x = c$ if there are $a'$ and $b'$ obeying
$a \le a' < c < b' \le b$ such that $f(x) \ge f(c)$ for all $x$ obeying $a' < x < b'$. Note
the strict inequalities in $a' < c < b'$.


• Similarly, we say that $f(x)$ has a local maximum at $x = c$ if there are $a'$ and
$b'$ obeying $a \le a' < c < b' \le b$ such that $f(x) \le f(c)$ for all $x$ obeying
$a' < x < b'$. Note the strict inequalities in $a' < c < b'$.


The global maxima and minima of a function are called the global extrema of the 
function, while the local maxima and minima are called the local extrema. 


Consider again the function we showed in the figure above

[Figure: the same graph, with the global maximum, local maxima, local minima and global minimum labelled.]


It has 2 local maxima and 2 local minima. The global maximum occurs at the middle green 
point (which is also a local maximum), while the global minimum occurs at the rightmost 
blue point (which is not a local minimum). 

Using the above definition we can summarise what we have learned above as the 
following theorem: 
Theorem 3.5.4.


If a function f(x) has a local maximum or local minimum at x = c and if f’(c) 


exists, then f’(c) = 0. 


It is often the case that, when f(x) has a local maximum at x = c, the function f(x) 
increases as x approaches c from the left and decreases as x leaves c to the right. That is, 
f'(x) > 0 for x just to the left of c and f’(x) < 0 for x just to the right of c. Similarly, it is 
often the case that, when f(x) has a local minimum at x = c, f’(x) < 0 for x just to the 
left of c and f’(x) > 0 for x just to the right of c. Theorem 3.5.4 says that, when f(x) has a
local maximum or minimum at x = c, there are two possibilities. 


e The derivative f’(c) = 0. This case is illustrated in the following figure. 


Observe that, in this example, f’(x) changes continuously from negative to positive 
at the local minimum, taking the value zero at the local minimum (the red dot). 


e The derivative f’(c) does not exist. This case is illustrated in the following figure. 


Observe that, in this example, f’(x) changes discontinuously from negative to posi- 
tive at the local minimum (x = 0) and f’(0) does not exist. 
This theorem demonstrates that the points at which the derivative is zero or does not exist
are very important. It simplifies the discussion that follows if we give these points names. 


Let f(x) be a function and let c be a point in its domain. Then 


e if f’(c) exists and is zero we call x = c a critical point of the function, and 


• if f’(c) does not exist then we call x = c a singular point of the function.


Warning 3.5.6. 


Note that some people (and texts) will combine both of these cases and call x = c
a critical point when either the derivative is zero or does not exist. The reader
should be aware of the lack of convention on this point⁵⁰ and should be careful to
understand whether the more inclusive definition of critical point is being used,
or if the text is using the more precise definition that distinguishes critical and
singular points.


We'll now look at a few simple examples involving local maxima and minima, critical 
points and singular points. Then we will move on to global maxima and minima. 


Example 3.5.7 


In this example, we’ll look for local maxima and minima of the function $f(x) = x^3 - 6x$
on the interval $-2 \le x \le 3$.


• First compute the derivative

\[ f'(x) = 3x^2 - 6. \]

Since this is a polynomial it is defined everywhere on the domain and so there will
not be any singular points. So we now look for critical points.


• To do so we look for zeroes of the derivative

\[ f'(x) = 3x^2 - 6 = 3(x^2 - 2) = 3(x - \sqrt{2})(x + \sqrt{2}). \]

This derivative takes the value 0 at two different values of $x$, namely $x = c_- = -\sqrt{2}$
and $x = c_+ = \sqrt{2}$. Here is a sketch of the graph of $f(x)$.


50 No pun intended.

From the figure we see that

- $f(x)$ has a local minimum at $x = c_+$ (i.e. we have $f(x) \ge f(c_+)$ whenever $x$ is
close to $c_+$) and

- $f(x)$ has a local maximum at $x = c_-$ (i.e. we have $f(x) \le f(c_-)$ whenever $x$ is
close to $c_-$) and

- the global minimum of $f(x)$, for $x$ in the interval $-2 \le x \le 3$, is at $x = c_+$ (i.e.
we have $f(x) \ge f(c_+)$ whenever $-2 \le x \le 3$) and

- the global maximum of $f(x)$, for $x$ in the interval $-2 \le x \le 3$, is at $x = 3$ (i.e.
we have $f(x) \le f(3)$ whenever $-2 \le x \le 3$).


e Note that we have carefully constructed this example to illustrate that the global 
maximum (or minimum) of a function on an interval may or may not also be a local 
maximum (or minimum) of the function. 




Example 3.5.8 


In this example, we’ll look for local maxima and minima of the function $f(x) = x^3$ on the
interval $-1 \le x \le 1$.


• First compute the derivative:

\[ f'(x) = 3x^2. \]


Again, this is a polynomial and so defined on all of the domain. The function will 
not have singular points, but may have critical points. 


• The derivative is zero only when $x = 0$, so $x = c = 0$ is the only critical point of the
function.


• The graph of $f(x)$ is sketched below. From that sketch we see that $f(x)$ has neither a
local maximum nor a local minimum at $x = c$ despite the fact that $f'(c) = 0$: we
have $f(x) < f(c) = 0$ for all $x < c = 0$ and $f(x) > f(c) = 0$ for all $x > c = 0$.

• Note that this example has been constructed to illustrate that a critical point (or
singular point) of a function need not be a local maximum or minimum for the function.


• Reread Theorem 3.5.4. It says⁵¹ that, “if f(x) has a local maximum/minimum at
x = c and if f is differentiable at x = c, then f’(c) = 0”. It does not say that “if
f’(c) = 0 then f has a local maximum/minimum at x = c”.


Example 3.5.9 


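In this example, we’ll look for local maxima and minima of the function

\[ f(x) = |x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases} \]

on the interval $-1 \le x \le 1$.


• Again, start by computing the derivative (reread Example 2.2.10):

\[ f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ \text{undefined} & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases} \]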


e This derivative never takes the value 0, so the function does not have any critical 
points. However the derivative does not exist at the point x = 0, so that point is a 
singular point. 


e Here is a sketch of the graph of f(x). 


51 A very common error of logic that people make is “Affirming the consequent”. “If P then Q” is true,
then observe Q and conclude P; this is false. “If he is Shakespeare then he is dead” “That man is
dead” “He must be Shakespeare”. Or you may have also seen someone use this reasoning: “If a person
is a genius before their time then they are misunderstood.” “I am misunderstood” “So I must be a
genius before my time.”

From the figure we see that f(x) has a local (and in fact global) minimum at x = 0
despite the fact that x = 0 is not a critical point.


e Reread Theorem 3.5.4 yet again. It says that, “if f(x) has a local maximum/minimum 
at x = cand if f is differentiable at x = c, then f'(c) = 0”. It says nothing about what 
happens at points where the derivative does not exist. Indeed that is why we have 
to consider both critical points and singular points when we look for maxima and 
minima. 




3.5.2 » Finding global maxima and minima 


We now have a technique for finding local maxima and minima — just look for values of 
x for which either f’(x) = 0 or f’(x) does not exist. What about finding global maxima 
and minima? Let’s again consider the question 


Suppose that the maximum (or minimum) value of f(x), for a ≤ x ≤ b, is f(c).
What does that tell us about c?


If c obeys a < c < b (note the strict inequalities), then f has a local maximum (or minimum)
at x = c and Theorem 3.5.4 tells us that either f’(c) = 0 or f’(c) does not exist. The only
other places that a maximum or minimum can occur are the ends of the interval. We can
summarise this as:


Theorem 3.5.10. 


If f(x) has a global maximum or global minimum, for a ≤ x ≤ b, at x = c then
there are 3 possibilities. Either

• f’(c) = 0, or
• f’(c) does not exist, or
• c = a or c = b.

That is, a global maximum or minimum must occur either at a critical point, a
singular point or at the endpoints of the interval.




This theorem provides the basis for a method to find the maximum and minimum
values of f(x) for a ≤ x ≤ b:


Corollary 3.5.11. 


Let f(x) be a continuous function on the interval a ≤ x ≤ b. Then to find the
global maximum and minimum of the function:

• Make a list of all values of c, with a ≤ c ≤ b, for which
  - f’(c) = 0, or
  - f’(c) does not exist, or
  - c = a or c = b.

  That is, compute the function at all the critical points, singular points,
  and endpoints.

• Evaluate f(c) for each c in that list. The largest (or smallest) of those values
is the largest (or smallest) value of f(x) for a ≤ x ≤ b.
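
The recipe in Corollary 3.5.11 can be sketched in a few lines of Python. This is only an illustration: the helper name `global_extrema` is hypothetical, and the list of candidates (critical points, singular points and endpoints) is assumed to have been found by hand beforehand.

```python
def global_extrema(f, candidates):
    """Evaluate f at each candidate and return ((x_min, f_min), (x_max, f_max))."""
    values = [(x, f(x)) for x in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest

# f(x) = |x| on [-1, 1]: singular point 0, endpoints -1 and 1, no critical points.
lowest, highest = global_extrema(abs, [-1.0, 0.0, 1.0])
print("minimum:", lowest, "maximum:", highest)
```

The code does no calculus at all; it only carries out the final "evaluate and compare" step, so its answer is only as good as the candidate list it is given.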


Let’s now demonstrate how to use this strategy. The function in this first example is 
not too simple — but it is a good example of a function that contains both a singular point 
and a critical point. 


Example 3.5.12 


Find the largest and smallest values of the function f(x) = 2x^{5/3} + 3x^{2/3} for −1 ≤ x ≤ 1.


Solution. We will apply the method in Corollary 3.5.11. It is perhaps easiest to find the 
values at the endpoints of the intervals and then move on to the values at any critical or 
singular points. 


• Before we get into things, notice that we can rewrite the function by factoring it:

$$f(x) = 2x^{5/3} + 3x^{2/3} = x^{2/3}\,(2x + 3)$$

• Let’s compute the function at the endpoints of the interval:

$$f(1) = 2 + 3 = 5$$
$$f(-1) = 2\cdot(-1)^{5/3} + 3\cdot(-1)^{2/3} = -2 + 3 = 1$$

• To compute the function at the critical and singular points we first need to find the
derivative:

$$f'(x) = 2\cdot\frac{5}{3}x^{2/3} + 3\cdot\frac{2}{3}x^{-1/3} = \frac{10}{3}x^{2/3} + 2x^{-1/3} = \frac{10x + 6}{3x^{1/3}}$$

• Notice that the numerator and denominator are defined for all x. The only place the
derivative is undefined is when the denominator is zero. Hence the only singular
point is at x = 0. The corresponding function value is

$$f(0) = 0$$


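• To find the critical points we need to solve f’(x) = 0:

$$0 = \frac{10x + 6}{3x^{1/3}}$$

Hence we must have 10x = −6 or x = −3/5. The corresponding function value is

$$f(x) = x^{2/3}\,(2x + 3) \qquad \text{recall this from above, then}$$
$$f(-3/5) = (-3/5)^{2/3}\cdot\left(2\cdot\frac{-3}{5} + 3\right) = \left(\frac{9}{25}\right)^{1/3}\cdot\frac{-6 + 15}{5} = \left(\frac{9}{25}\right)^{1/3}\cdot\frac{9}{5} \approx 1.28$$

Note that if we do not want to approximate the root (if, for example, we do not have
a calculator handy), then we can also write

$$f(-3/5) = \left(\frac{9}{25}\right)^{1/3}\cdot\frac{9}{5} = \left(\frac{9}{25}\right)^{1/3}\cdot\frac{9}{25}\cdot 5 = 5\cdot\left(\frac{9}{25}\right)^{4/3}$$

Since 0 < 9/25 < 1, we know that 0 < (9/25)^{4/3} < 1, and hence

$$0 < f(-3/5) = 5\cdot\left(\frac{9}{25}\right)^{4/3} < 5.$$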


• We summarise our work in this table

c    | −3/5                  | 0              | −1       | 1
type | critical point        | singular point | endpoint | endpoint
f(c) | (9/5)·∛(9/25) ≈ 1.28  | 0              | 1        | 5


• The largest value of f in the table is 5 and the smallest value of f in the table is 0.


• Thus on the interval −1 ≤ x ≤ 1 the global maximum of f is 5, and is taken at x = 1,
while the global minimum value of f(x) is 0, and is taken at x = 0.

• For completeness we also sketch the graph of this function on the same interval.

[Figure: graph of y = f(x) = 2x^{5/3} + 3x^{2/3}]

Later (in Section 3.6) we will see how to construct such a sketch without using a
calculator or computer.
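
As a rough numerical cross-check of Example 3.5.12, the short Python sketch below evaluates f at the four candidate points. The helper `real_cbrt` is a hypothetical name introduced here because Python's `**` operator does not return the real cube root of a negative number.

```python
import math

def real_cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def f(x):
    """f(x) = 2 x^(5/3) + 3 x^(2/3), using the real cube root."""
    r = real_cbrt(x)
    return 2 * r**5 + 3 * r**2

candidates = {"critical point": -3 / 5, "singular point": 0.0,
              "left endpoint": -1.0, "right endpoint": 1.0}
table = {name: (c, f(c)) for name, c in candidates.items()}
for name, (c, value) in table.items():
    print(f"{name:15s} c = {c:5.2f}  f(c) = {value:.4f}")

largest = max(table.values(), key=lambda pair: pair[1])
smallest = min(table.values(), key=lambda pair: pair[1])
```

The printed values should agree with the table in the example, with the largest value at the right endpoint and the smallest at the singular point.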




3.5.3 »» Max/min examples 


As noted at the beginning of this section, the problem of finding maxima and minima is 
a very important application of differential calculus in the real world. We now turn to a 
number of examples of this process. But to guide the reader we will describe a general 
procedure to follow for these problems. 


(1) Read: read the problem carefully. Work out what information is given in the
statement of the problem and what we are being asked to compute.


(2) Diagram: draw a diagram. This will typically help you to identify what you know
about the problem and what quantities you need to work out.


(3) Variables: assign variables to the quantities in the problem along with their units. It
is typically a good idea to make sensible choices of variable names: A for area, h for
height, t for time etc.


(4) Relations: find relations between the variables. By now you should know the
quantity we are interested in (the one we want to maximise or minimise) and we need to
establish a relation between it and the other variables.


(5) Reduce: reduce the relation down to a function of one variable. In order to apply the
calculus we know, we must have a function of a single variable. To do this we need to
use all the information we have to eliminate variables. We should also work out the
domain of the resulting function.


(6) Maximise or minimise: we can now apply the methods of Corollary 3.5.11 to find
the maximum or minimum of the quantity we need (as the problem dictates).

(7) Be careful: make sure your answer makes sense. Make sure quantities are physical.
For example, lengths and areas cannot be negative.


(8) Answer the question: be sure your answer really answers the question asked in the
problem.


Let us start with a relatively simple problem: 


Example 3.5.13


A closed rectangular container with a square base is to be made from two different mate- 
rials. The material for the base costs $5 per square meter, while the material for the other 
five sides costs $1 per square meter. Find the dimensions of the container which has the 
largest possible volume if the total cost of materials is $72. 


Solution. We can follow the steps we outlined above to find the solution. 


e We need to determine the area of the two types of materials used and the corre- 
sponding total cost. 


e Draw a picture of the box. 


The more useful picture is the unfolded box on the right. 


• In the picture we have already introduced two variables. The square base has
side-length b metres and it has height h metres. Let the area of the base be A_b and the
area of the other five sides be A_s (both in m²), and the total cost be C (in dollars).
Finally let the volume enclosed be V m³.


• Some simple geometry tells us that

$$A_b = b^2$$
$$A_s = 4bh + b^2$$
$$V = b^2 h$$
$$C = 5\cdot A_b + 1\cdot A_s = 5b^2 + 4bh + b^2 = 6b^2 + 4bh.$$
• To eliminate one of the variables we use the fact that the total cost is $72.

$$C = 6b^2 + 4bh = 72 \qquad \text{rearrange}$$
$$4bh = 72 - 6b^2 \qquad \text{isolate } h$$
$$h = \frac{72 - 6b^2}{4b} = \frac{3}{2}\cdot\frac{12 - b^2}{b}$$

Substituting this into the volume gives

$$V = b^2 h = \frac{3b}{2}(12 - b^2) = 18b - \frac{3}{2}b^3$$

Now note that since b is a length it cannot be negative, so b ≥ 0. Further since
volume cannot be negative, we must also have

$$12 - b^2 \ge 0$$

and so b ≤ √12.


• Now we can apply Corollary 3.5.11 on the above expression for the volume with
0 ≤ b ≤ √12. The endpoints give:

$$V(0) = 0$$
$$V(\sqrt{12}) = 0$$

The derivative is

$$V'(b) = 18 - \frac{9b^2}{2}$$

Since this is a polynomial there are no singular points. However we can solve
V’(b) = 0 to find critical points:

$$18 - \frac{9b^2}{2} = 0 \qquad \text{divide by 9 and multiply by 2}$$
$$4 - b^2 = 0$$

Hence b = ±2. Thus the only critical point in the domain is b = 2. The corresponding
volume is

$$V(2) = 18\times 2 - \frac{3}{2}\times 2^3 = 36 - 12 = 24.$$

So by Corollary 3.5.11, the maximum volume is 24 when b = 2 and

$$h = \frac{3}{2}\cdot\frac{12 - b^2}{b} = \frac{3}{2}\cdot\frac{12 - 4}{2} = 6.$$


e All our quantities make sense; lengths, areas and volumes are all non-negative. 
• Checking the question again, we see that we are asked for the dimensions of the
container (rather than its volume) so we can answer with


The container with dimensions 2 × 2 × 6 m will be the largest possible.
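
A brute-force check of this answer is easy to sketch in Python. The code below assumes the reduced volume V(b) = 18b − (3/2)b³ and the constraint h = (72 − 6b²)/(4b) derived above, and simply scans a fine grid of b values; the grid size is an arbitrary choice made for illustration.

```python
import math

def volume(b):
    """V(b) = 18 b - (3/2) b^3, the volume after eliminating h."""
    return 18 * b - 1.5 * b**3

def height(b):
    """h = (72 - 6 b^2) / (4 b), from the cost constraint C = 72."""
    return (72 - 6 * b**2) / (4 * b)

b_max = math.sqrt(12)
grid = [b_max * k / 100_000 for k in range(100_001)]
b_best = max(grid, key=volume)
print(b_best, volume(b_best), height(b_best))
```

A grid scan is much cruder than the calculus argument, but it is a useful sanity check that no larger volume hides elsewhere in 0 ≤ b ≤ √12.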


Example 3.5.14


A rectangular sheet of cardboard is 6 inches by 9 inches. Four identical squares are cut 
from the corners of the cardboard, as shown in the figure below, and the remaining piece 
is folded into an open rectangular box. What should the size of the cut out squares be in 
order to maximize the volume of the box? 


Solution. This one is quite similar to the previous one, so we perhaps don’t need to go 
into so much detail. 


e After reading carefully we produce the following picture: 


• Let the height of the box be x inches, and the base be ℓ × w inches. The volume of
the box is then V cubic inches.


• Some simple geometry tells us that ℓ = 9 − 2x, w = 6 − 2x and so

$$V = x(9 - 2x)(6 - 2x) \text{ cubic inches} = 54x - 30x^2 + 4x^3.$$

Notice that since all lengths must be non-negative, we must have

$$x, \ell, w \ge 0$$

and so 0 ≤ x ≤ 3 (if x > 3 then w < 0).

• We can now apply Corollary 3.5.11. First the endpoints of the interval give

$$V(0) = 0 \qquad V(3) = 0$$

The derivative is

$$V'(x) = 54 - 60x + 12x^2 = 6(9 - 10x + 2x^2)$$

Since this is a polynomial there are no singular points. To find critical points we
solve V’(x) = 0 to get

$$x_\pm = \frac{10 \pm \sqrt{100 - 4\times 2\times 9}}{4} = \frac{10 \pm \sqrt{28}}{4} = \frac{10 \pm 2\sqrt{7}}{4} = \frac{5 \pm \sqrt{7}}{2}$$

We can then use a calculator to approximate

$$x_+ \approx 3.82 \qquad x_- \approx 1.18.$$

So x₋ is inside the domain, while x₊ lies outside.


Alternatively⁵², we can bound x± by first noting that 2 ≤ √7 ≤ 3. From this we
know that

$$1 = \frac{5 - 3}{2} \le x_- = \frac{5 - \sqrt{7}}{2} \le \frac{5 - 2}{2} = 1.5$$
$$3.5 = \frac{5 + 2}{2} \le x_+ = \frac{5 + \sqrt{7}}{2} \le \frac{5 + 3}{2} = 4$$


• Since the volume is zero when x = 0, 3, it must be the case that the volume is
maximised when x = x₋ = (5 − √7)/2.


e Notice that since 0 < x_ < 3 we know that the other lengths are positive, so our 
answer makes sense. Further, the question only asks for the length x and not the 
resulting volume so we have answered the question. 
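
For readers who do have a calculator (or a computer) to hand, the roots and the bounds above can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration.

```python
import math

def volume(x):
    """V(x) = x (9 - 2x)(6 - 2x), the volume of the open box in cubic inches."""
    return x * (9 - 2 * x) * (6 - 2 * x)

# Roots of 2x^2 - 10x + 9 = 0 by the quadratic formula.
disc = math.sqrt(10**2 - 4 * 2 * 9)
x_minus = (10 - disc) / 4
x_plus = (10 + disc) / 4
print(x_minus, x_plus, volume(x_minus))
```

Only `x_minus` lies in the domain 0 ≤ x ≤ 3, which matches the bounds 1 ≤ x₋ ≤ 1.5 and 3.5 ≤ x₊ ≤ 4 obtained by hand.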




There is a new wrinkle in the next two examples. Each involves finding the minimum
value of a function f(x) with x running over all real numbers, rather than just over a finite
interval as in Corollary 3.5.11. Both in Example 3.5.16 and in Example 3.5.17 the function
f(x) tends to +∞ as x tends to either +∞ or −∞. So the minimum value of f(x) will be
achieved for some finite value of x, which will be a local minimum as well as a global
minimum.


52 Say if we do not have a calculator to hand, or your instructor insists that the problem be done without
one.
Theorem 3.5.15.


Let f(x) be defined and continuous for all −∞ < x < ∞. Let c be a finite real
number.


(a) If lim_{x→+∞} f(x) = +∞ and lim_{x→−∞} f(x) = +∞ and if f(x) has a global minimum
at x = c, then there are 2 possibilities. Either
• f’(c) = 0, or
• f’(c) does not exist


That is, a global minimum must occur either at a critical point or at a singular
point.


(b) If lim_{x→+∞} f(x) = −∞ and lim_{x→−∞} f(x) = −∞ and if f(x) has a global maximum
at x = c, then there are 2 possibilities. Either
• f’(c) = 0, or
• f’(c) does not exist


That is, a global maximum must occur either at a critical point or at a singular
point.


Example 3.5.16 


Find the point on the line y = 6 — 3x that is closest to the point (7,5). 


Solution. In this problem 


e Asimple picture 


• Some notation is already given to us. Let a point on the line have coordinates (x, y),
and we do not need units. And let ℓ be the distance from the point (x, y) to the
point (7, 5).

• Since the points are on the line the coordinates (x, y) must obey

$$y = 6 - 3x$$

Notice that x and y have no further constraints. The distance ℓ is given by

$$\ell^2 = (x - 7)^2 + (y - 5)^2$$

• We can now eliminate the variable y:

$$\ell^2 = (x-7)^2 + (y-5)^2 = (x-7)^2 + (6 - 3x - 5)^2 = (x-7)^2 + (1 - 3x)^2$$
$$= x^2 - 14x + 49 + 1 - 6x + 9x^2 = 10x^2 - 20x + 50 = 10(x^2 - 2x + 5)$$
$$\ell = \sqrt{10}\cdot\sqrt{x^2 - 2x + 5}$$

Notice that as x → ±∞ the distance ℓ → +∞.

• We can now apply Theorem 3.5.15


— Since the distance is defined for all real x, we do not have to check the endpoints 
of the domain — there are none. 


- Form the derivative:

$$\frac{d\ell}{dx} = \sqrt{10}\,\frac{2x - 2}{2\sqrt{x^2 - 2x + 5}}$$

It is zero when x = 1, and undefined if x² − 2x + 5 ≤ 0. However, since

$$x^2 - 2x + 5 = (x^2 - 2x + 1) + 4 = \underbrace{(x-1)^2}_{\ge 0} + 4$$

we know that x² − 2x + 5 ≥ 4. Thus the function has no singular points and the
only critical point occurs at x = 1. The corresponding function value is then

$$\ell(1) = \sqrt{10}\,\sqrt{1 - 2 + 5} = \sqrt{10}\,\sqrt{4} = 2\sqrt{10}.$$

- Thus the minimum value of the distance is ℓ = 2√10 ≈ 6.32 and occurs at x = 1.
e This answer makes sense — the distance is not negative. 


• The question asks for the point that minimises the distance, not that minimum
distance. Hence the answer is x = 1, y = 6 − 3 = 3. That is,


The point that minimises the distance is (1, 3).
Notice that we can make the analysis easier by observing that the point that minimises
the distance also minimises the squared-distance. So that instead of minimising the
function ℓ, we can just minimise ℓ²:

$$\ell^2 = 10(x^2 - 2x + 5)$$


The resulting algebra is a bit easier and we don’t have to hunt for singular points. 
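
The squared-distance shortcut is simple to check numerically. In the Python sketch below the squared distance 10x² − 20x + 50 is treated as an upward-opening parabola, so its minimiser is the vertex x = −b/(2a); the function and variable names are illustrative only.

```python
import math

def distance(x):
    """Distance from the point (x, 6 - 3x) on the line to the point (7, 5)."""
    return math.hypot(x - 7, (6 - 3 * x) - 5)

# Vertex of the parabola 10x^2 - 20x + 50, i.e. x = -(-20) / (2 * 10).
x_star = 20 / (2 * 10)
point = (x_star, 6 - 3 * x_star)
print(point, distance(x_star))
```

The vertex gives the point (1, 3), and the distance there is √40 = 2√10, in agreement with the derivative-based calculation.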


Example 3.5.17 


Find the minimum distance from (2, 0) to the curve y² = x² + 1.


Solution. This is very much like the previous question. 


e After reading the problem carefully we can draw a picture 


• In this problem we do not need units and the variables x, y are supplied. We define
the distance to be ℓ and it is given by

$$\ell^2 = (x - 2)^2 + y^2.$$

As noted in the previous problem, we will minimise the squared-distance since that
also minimises the distance.


• Since x, y satisfy y² = x² + 1, we can write the distance as a function of x:

$$\ell^2 = (x-2)^2 + y^2 = (x-2)^2 + (x^2 + 1)$$

Notice that as x → ±∞ the squared-distance ℓ² → +∞.


• Since the squared-distance is a polynomial it will not have any singular points, only
critical points. The derivative is

$$\frac{d}{dx}\ell^2 = 2(x - 2) + 2x = 4x - 4$$

so the only critical point occurs at x = 1.

• When x = 1, y = ±√2 and the distance is

$$\ell^2 = (1 - 2)^2 + (1 + 1) = 3 \qquad \ell = \sqrt{3}$$

and thus the minimum distance from the curve to (2, 0) is √3.




Example 3.5.18 


A water trough is to be constructed from a metal sheet of width 45 cm by bending up one
third of the sheet on each side through an angle θ. Which θ will allow the trough to carry
the maximum amount of water?


Solution. Clearly 0 ≤ θ ≤ π, so we are back in the domain⁵³ of Corollary 3.5.11.


• After reading the problem carefully we should realise that it is really asking us to
maximise the cross-sectional area. A figure really helps.


• From this we are led to define the height h cm and cross-sectional area A cm². Both
are functions of θ.

$$h = 15\sin\theta$$

while the area can be computed as the sum of the central 15 × h rectangle, plus two
triangles. Each triangle has height h and base 15 cos θ. Hence

$$A = 15h + 2\cdot\frac{1}{2}\cdot 15\cos\theta\cdot h = 15h\,(1 + \cos\theta)$$


• Since h = 15 sin θ we can rewrite the area as a function of just θ:

$$A(\theta) = 225\sin\theta\,(1 + \cos\theta)$$

where 0 ≤ θ ≤ π.


53 Again, no pun intended. 
• Now we use Corollary 3.5.11. The ends of the interval give

$$A(0) = 225\sin 0\,(1 + \cos 0) = 0$$
$$A(\pi) = 225\sin\pi\,(1 + \cos\pi) = 0$$

The derivative is

$$A'(\theta) = 225\cos\theta\cdot(1 + \cos\theta) + 225\sin\theta\cdot(-\sin\theta)$$
$$= 225\left[\cos\theta + \cos^2\theta - \sin^2\theta\right] \qquad \text{recall } \sin^2\theta = 1 - \cos^2\theta$$
$$= 225\left[\cos\theta + 2\cos^2\theta - 1\right]$$

This is a continuous function, so there are no singular points. However we can still
hunt for critical points by solving A’(θ) = 0. That is

$$2\cos^2\theta + \cos\theta - 1 = 0 \qquad \text{factor carefully}$$
$$(2\cos\theta - 1)(\cos\theta + 1) = 0$$

Hence we must have cos θ = −1 or cos θ = 1/2. On the domain 0 ≤ θ ≤ π, this means
θ = π/3 or θ = π.

$$A(\pi) = 0$$
$$A(\pi/3) = 225\sin(\pi/3)\,(1 + \cos(\pi/3)) = 225\cdot\frac{\sqrt{3}}{2}\cdot\left(1 + \frac{1}{2}\right) = 225\cdot\frac{3\sqrt{3}}{4} \approx 292.28$$


• Thus the cross-sectional area is maximised when θ = π/3.
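
The maximising angle can also be confirmed by a crude grid search. The Python sketch below assumes the area formula A(θ) = 225 sin θ (1 + cos θ) derived above; the grid resolution is an arbitrary choice.

```python
import math

def area(theta):
    """Cross-sectional area A(theta) = 225 sin(theta) (1 + cos(theta)) in cm^2."""
    return 225 * math.sin(theta) * (1 + math.cos(theta))

n = 100_000
grid = [math.pi * k / n for k in range(n + 1)]
theta_best = max(grid, key=area)
print(theta_best, math.pi / 3, area(theta_best))
```

The grid maximiser should sit next to π/3, with an area close to 225·3√3/4.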


Example 3.5.19


Find the points on the ellipse x²/4 + y² = 1 that are nearest to and farthest from the point
(1, 0).


Solution. While this is another distance problem, the possible values of x, y are bounded, 
so we need Corollary 3.5.11 rather than Theorem 3.5.15. 


e We start by drawing a picture: 
• Let ℓ be the distance from the point (x, y) on the ellipse to the point (1, 0). As was
the case above, we will maximise (and minimise) the squared-distance.

$$\ell^2 = (x-1)^2 + y^2.$$

• Since (x, y) lie on the ellipse we have

$$\frac{x^2}{4} + y^2 = 1$$

Note that this also shows that −2 ≤ x ≤ 2 and −1 ≤ y ≤ 1.
Isolating y² and substituting this into our expression for ℓ² gives

$$\ell^2 = (x-1)^2 + \underbrace{1 - \frac{x^2}{4}}_{= y^2}.$$

• Now we can apply Corollary 3.5.11. The endpoints of the domain give

$$\ell^2(-2) = (-2-1)^2 + 1 - (-2)^2/4 = 3^2 + 1 - 1 = 9$$
$$\ell^2(2) = (2-1)^2 + 1 - 2^2/4 = 1 + 1 - 1 = 1$$

The derivative is

$$\frac{d}{dx}\ell^2 = 2(x-1) - \frac{x}{2} = \frac{3x}{2} - 2$$

Thus there are no singular points, but there is a critical point at x = 4/3. The
corresponding squared-distance is

$$\ell^2(4/3) = \left(\frac{4}{3} - 1\right)^2 + 1 - \frac{(4/3)^2}{4} = (1/3)^2 + 1 - (4/9) = 6/9 = 2/3.$$


• To summarise (and giving distances and coordinates of points):

x   | (x, y)         | ℓ
−2  | (−2, 0)        | 3
4/3 | (4/3, ±√5/3)   | √(2/3)
2   | (2, 0)         | 1


The point of maximum distance is (−2, 0), and the point of minimum distance is
(4/3, ±√5/3).
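
The summary table can be reproduced with a short Python sketch. It assumes the reduced squared-distance ℓ² = (x − 1)² + 1 − x²/4 obtained above and evaluates it at the two endpoints and the critical point.

```python
import math

def squared_distance(x):
    """l^2 = (x - 1)^2 + 1 - x^2/4 for a point on the ellipse x^2/4 + y^2 = 1."""
    return (x - 1) ** 2 + 1 - x**2 / 4

candidates = [-2.0, 4 / 3, 2.0]
for x in candidates:
    y = math.sqrt(max(0.0, 1 - x**2 / 4))  # the non-negative y on the ellipse
    print(x, y, math.sqrt(squared_distance(x)))

nearest = min(candidates, key=squared_distance)
farthest = max(candidates, key=squared_distance)
```

Here `nearest` and `farthest` are the x-coordinates only; the matching y-values are ±√5/3 and 0 respectively, as in the table.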
Example 3.5.20


Find the dimensions of the rectangle of largest area that can be inscribed in an equilateral 
triangle of side a if one side of the rectangle lies on the base of the triangle. 


Solution. Since the rectangle must sit inside the triangle, its dimensions are bounded and 
we will end up using Corollary 3.5.11. 


• Carefully draw a picture:

[Figure: the equilateral triangle with top vertex at (0, √3a/2) (left), and the same triangle with the inscribed rectangle (right)]

We have drawn (on the left) the triangle in the xy-plane with its base on the x-axis.
The base has been drawn running from (−a/2, 0) to (a/2, 0) so its centre lies at the
origin. A little Pythagoras (or a little trigonometry) tells us that the height of the
triangle is

$$\sqrt{a^2 - (a/2)^2} = \frac{\sqrt{3}}{2}\,a = a\cdot\sin\frac{\pi}{3}$$

Thus the vertex at the top of the triangle lies at (0, (√3/2)a).


e If we construct a rectangle that does not touch the sides of the triangle, then we can 
increase the dimensions of the rectangle until it touches the triangle and so make its 
area larger. Thus we can assume that the two top corners of the rectangle touch the 
triangle as drawn in the right-hand figure above. 
• Now let the rectangle be 2x wide and y high. And let A denote its area. Clearly

$$A = 2xy$$

where 0 ≤ x ≤ a/2 and 0 ≤ y ≤ (√3/2)a.

• Our construction means that the top-right corner of the rectangle will have
coordinates (x, y) and lie on the line joining the top vertex of the triangle at (0, √3a/2) to
the bottom-right vertex at (a/2, 0). In order to write the area as a function of x alone,
we need the equation for this line since it will tell us how to write y as a function of
x. The line has slope

$$\text{slope} = \frac{\sqrt{3}a/2 - 0}{0 - a/2} = -\sqrt{3}$$

and passes through the point (0, √3a/2), so any point (x, y) on that line satisfies:

$$y = -\sqrt{3}\,x + \frac{\sqrt{3}}{2}a$$


• We can now write the area as a function of x alone

$$A(x) = 2x\left(-\sqrt{3}\,x + \frac{\sqrt{3}}{2}a\right) = \sqrt{3}\,x(a - 2x)$$

with 0 ≤ x ≤ a/2.

• The ends of the domain give:

$$A(0) = 0 \qquad A(a/2) = 0.$$

The derivative is

$$A'(x) = \sqrt{3}\,\big(x\cdot(-2) + 1\cdot(a - 2x)\big) = \sqrt{3}\,(a - 4x).$$

Since this is a polynomial there are no singular points, but there is a critical point at
x = a/4. There

$$A(a/4) = \sqrt{3}\cdot\frac{a}{4}\cdot(a - a/2) = \sqrt{3}\cdot\frac{a^2}{8}$$
$$y = -\sqrt{3}\cdot(a/4) + \frac{\sqrt{3}}{2}a = \sqrt{3}\cdot\frac{a}{4}.$$


• Checking the question again, we see that we are asked for the dimensions rather
than the area, so the answer is 2x × y:


The largest such rectangle has dimensions a/2 × (√3/4)a.
This next one is a good physics example. In it we will derive Snell’s Law⁵⁴ from
Fermat’s principle⁵⁵.


Example 3.5.21 


Consider the figure below which shows the trajectory of a ray of light as it passes through 
two different mediums (say air and water). 


Let c_a be the speed of light in air and c_w be the speed of light in water. Fermat’s principle
states that a ray of light will always travel along a path that minimises the time taken. So
if a ray of light travels from P (in air) to Q (in water) then it will “choose” the point O (on
the interface) so as to minimise the total time taken. Use this idea to show Snell’s law,

$$\frac{\sin\theta_i}{\sin\theta_r} = \frac{c_a}{c_w}$$

where θ_i is the angle of incidence and θ_r is the angle of refraction (as illustrated in the
figure above).


Solution. This problem is a little more abstract than the others we have examined, but we 
can still apply Theorem 3.5.15. 


• We are given a figure in the statement of the problem and it contains all the relevant
points and angles. However it will simplify things if we decide on a coordinate
system. Let’s assume that the point O lies on the x-axis, at coordinates (x, 0). The point
P then lies above the axis at (X_P, +Y_P), while Q lies below the axis at (X_Q, −Y_Q).
This is drawn below.


54 Snell’s law is named after the Dutch astronomer Willebrord Snellius who derived it in around 1621,
though it was first stated accurately in 984 by Ibn Sahl.

55 Named after Pierre de Fermat who described it in a letter in 1662. The beginnings of the idea, however,
go back as far as Hero of Alexandria in around 60CE. Hero is credited with many inventions including
the first vending machine, and a precursor of the steam engine called an aeolipile.
• The statement of Snell’s law contains terms sin θ_i and sin θ_r, so it is a good idea for
us to see how to express these in terms of the coordinates we have just introduced:

$$\sin\theta_i = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{(x - X_P)}{\sqrt{(X_P - x)^2 + Y_P^2}}$$
$$\sin\theta_r = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{(X_Q - x)}{\sqrt{(X_Q - x)^2 + Y_Q^2}}$$


• Let ℓ_P denote the distance PO, and ℓ_Q denote the distance OQ. Then we have

$$\ell_P = \sqrt{(X_P - x)^2 + Y_P^2}$$
$$\ell_Q = \sqrt{(X_Q - x)^2 + Y_Q^2}$$

If we then denote the total time taken by T, then

$$T = \frac{\ell_P}{c_a} + \frac{\ell_Q}{c_w} = \frac{1}{c_a}\sqrt{(X_P - x)^2 + Y_P^2} + \frac{1}{c_w}\sqrt{(X_Q - x)^2 + Y_Q^2}$$

which is written as a function of x since all the other terms are constants.


• Notice that as x → +∞ or x → −∞ the total time T → ∞ and so we can apply
Theorem 3.5.15. The derivative is

$$\frac{dT}{dx} = \frac{1}{c_a}\,\frac{-2(X_P - x)}{2\sqrt{(X_P - x)^2 + Y_P^2}} + \frac{1}{c_w}\,\frac{-2(X_Q - x)}{2\sqrt{(X_Q - x)^2 + Y_Q^2}}$$

Notice that the terms inside the square-roots cannot be zero or negative since they
are both sums of squares and Y_P, Y_Q > 0. So there are no singular points, but there
is a critical point when T’(x) = 0, namely when

$$0 = \frac{1}{c_a}\,\frac{x - X_P}{\sqrt{(X_P - x)^2 + Y_P^2}} - \frac{1}{c_w}\,\frac{X_Q - x}{\sqrt{(X_Q - x)^2 + Y_Q^2}} = \frac{\sin\theta_i}{c_a} - \frac{\sin\theta_r}{c_w}$$

Rearrange this to get

$$\frac{\sin\theta_i}{c_a} = \frac{\sin\theta_r}{c_w} \qquad \text{move sines to one side}$$
$$\frac{\sin\theta_i}{\sin\theta_r} = \frac{c_a}{c_w}$$

which is exactly Snell’s law.
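
Fermat’s principle can also be tested numerically. The Python sketch below minimises the travel time T(x) directly and then compares the ratio of sines with the ratio of speeds. All numerical values (the speeds and the coordinates of P and Q) are hypothetical, chosen only for illustration, and the ternary search is a stand-in for the calculus argument, relying on the assumption that T is convex in x.

```python
import math

# Hypothetical values, chosen only for illustration.
C_A, C_W = 3.0, 2.25          # speeds in the two media (arbitrary units)
XP, YP = 0.0, 1.0             # P = (XP, +YP)
XQ, YQ = 2.0, 1.5             # Q = (XQ, -YQ)

def travel_time(x):
    """T(x) = l_P / c_a + l_Q / c_w for the crossing point O = (x, 0)."""
    return math.hypot(XP - x, YP) / C_A + math.hypot(XQ - x, YQ) / C_W

# T is a sum of convex functions of x, so a ternary search on [XP, XQ] finds its minimum.
lo, hi = XP, XQ
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if travel_time(m1) < travel_time(m2):
        hi = m2
    else:
        lo = m1
x_opt = (lo + hi) / 2

sin_i = (x_opt - XP) / math.hypot(XP - x_opt, YP)
sin_r = (XQ - x_opt) / math.hypot(XQ - x_opt, YQ)
print(sin_i / sin_r, C_A / C_W)
```

The two printed numbers should agree to several decimal places, which is Snell’s law for this particular choice of data.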


Example 3.5.22


The Statue of Liberty has height 46m and stands on a 47m tall pedestal. How far from the 
statue should an observer stand to maximize the angle subtended by the statue at the 
observer’s eye, which is 1.5m above the base of the pedestal? 


Solution. Obviously if we stand too close then all the observer sees is the pedestal, while 
if they stand too far then everything is tiny. The best spot for taking a photograph is 
somewhere in between. 


• Draw a careful picture⁵⁶ and we can put in the relevant lengths and angles.


• The height of the statue is h = 46m, and the height of the pedestal (above the eye) is
p = 47 − 1.5 = 45.5m. The horizontal distance from the statue to the eye is x. There
are two relevant angles. First θ is the angle subtended by the statue, while φ is the
angle subtended by the portion of the pedestal above the eye.


56 And make some healthy use of public domain clip art.
• Some trigonometry gives us

$$\tan\varphi = \frac{p}{x} \qquad \tan(\varphi + \theta) = \frac{p + h}{x}$$

Thus

$$\varphi = \arctan\frac{p}{x} \qquad \varphi + \theta = \arctan\frac{p + h}{x}$$

and so

$$\theta = \arctan\frac{p + h}{x} - \arctan\frac{p}{x}$$


If we allow the viewer to stand at any point in front of the statue, then 0 < x < ∞.
Further observe that as x → ∞ or x → 0 the angle θ → 0, since

$$\lim_{x\to\infty}\arctan\frac{p + h}{x} = \lim_{x\to\infty}\arctan\frac{p}{x} = 0$$

and

$$\lim_{x\to 0^+}\arctan\frac{p + h}{x} = \lim_{x\to 0^+}\arctan\frac{p}{x} = \frac{\pi}{2}.$$


Clearly the largest value of θ will be strictly positive and so has to be taken for some
0 < x < ∞. (Note the strict inequalities.) This x will be a local maximum as well as
a global maximum. As θ is not singular at any 0 < x < ∞, we need only search for
critical points. A careful application of the chain rule shows that the derivative is

$$\frac{d\theta}{dx} = \frac{1}{1 + \left(\frac{p+h}{x}\right)^2}\cdot\left(-\frac{p+h}{x^2}\right) - \frac{1}{1 + \left(\frac{p}{x}\right)^2}\cdot\left(-\frac{p}{x^2}\right) = -\frac{p+h}{x^2 + (p+h)^2} + \frac{p}{x^2 + p^2}$$


So a critical point occurs when

$$\frac{p+h}{x^2 + (p+h)^2} = \frac{p}{x^2 + p^2} \qquad \text{cross multiply}$$
$$(p+h)(x^2 + p^2) = p\,(x^2 + (p+h)^2) \qquad \text{collect } x \text{ terms}$$
$$x^2(p + h - p) = p(p+h)^2 - p^2(p+h) \qquad \text{clean up}$$
$$hx^2 = p(p+h)(p + h - p) = ph(p+h) \qquad \text{cancel common factors}$$
$$x^2 = p(p+h)$$
$$x = \pm\sqrt{p(p+h)} = \pm\sqrt{45.5\times 91.5} \approx \pm 64.5\text{ m}$$

• Thus the best place to stand is approximately 64.5m in front of or behind the statue. At
that point θ ≈ 0.342 radians or 19.6°.
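
The arithmetic in this last step is easy to get wrong by hand, so a short Python sketch may be helpful. It assumes exactly the values stated in the example (h = 46 and p = 47 − 1.5 = 45.5, both in metres) and evaluates the critical point x = √(p(p + h)) together with the subtended angle there.

```python
import math

h = 46.0          # height of the statue (m)
p = 47.0 - 1.5    # height of the pedestal above the observer's eye (m)

def theta(x):
    """Angle subtended by the statue at horizontal distance x (radians)."""
    return math.atan((p + h) / x) - math.atan(p / x)

x_star = math.sqrt(p * (p + h))
print(x_star, theta(x_star), math.degrees(theta(x_star)))
```

Evaluating `theta` slightly to either side of `x_star` gives a smaller angle, which is consistent with the critical point being the global maximum.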




Example 3.5.23 


Find the length of the longest rod that can be carried horizontally (no tilting allowed) from 
a corridor 3m wide into a corridor 2m wide. The two corridors are perpendicular to each 
other. 


Solution. 
• Suppose that we are carrying the rod around the corner, then if the rod is as long as
possible it must touch the corner and the outside walls of both corridors. A picture
of this is shown below.


You can see that this gives rise to two similar triangles, one inside each corridor.
Also the maximum length of the rod changes with the angle it makes with the walls
of the corridor.


Suppose that the angle between the rod and the inner wall of the 3m corridor is θ,
as illustrated in the figure above. At the same time it will make an angle of π/2 − θ
with the outer wall of the 2m corridor. Denote by ℓ₁(θ) the length of the part of
the rod forming the hypotenuse of the upper triangle in the figure above. Similarly,
denote by ℓ₂(θ) the length of the part of the rod forming the hypotenuse of the lower
triangle in the figure above. Then

$$\ell_1(\theta) = \frac{3}{\sin\theta} \qquad \ell_2(\theta) = \frac{2}{\cos\theta}$$

and the total length is

$$\ell(\theta) = \ell_1(\theta) + \ell_2(\theta) = \frac{3}{\sin\theta} + \frac{2}{\cos\theta}$$

where 0 ≤ θ ≤ π/2.
• The length of the longest rod we can move through the corridor in this way is the
minimum of ℓ(θ). Notice that ℓ(θ) is not defined at θ = 0, π/2. Indeed we find that
as θ → 0⁺ or θ → π/2⁻, the length ℓ → +∞. (You should be able to picture what
happens to our rod in those two limits). Clearly the minimum allowed ℓ(θ) is going
to be finite and will be achieved for some 0 < θ < π/2 (note the strict inequalities) and
so will be a local minimum as well as a global minimum. So we only need to find
zeroes of ℓ’(θ). Differentiating ℓ gives

$$\frac{d\ell}{d\theta} = -\frac{3\cos\theta}{\sin^2\theta} + \frac{2\sin\theta}{\cos^2\theta} = \frac{-3\cos^3\theta + 2\sin^3\theta}{\sin^2\theta\,\cos^2\theta}$$

This does not exist at θ = 0, π/2 (which we have already analysed) but does exist at
every 0 < θ < π/2 and is equal to zero when the numerator is zero. Namely when

$$2\sin^3\theta = 3\cos^3\theta \qquad \text{divide by } \cos^3\theta$$
$$2\tan^3\theta = 3$$
$$\tan\theta = \sqrt[3]{\frac{3}{2}}$$


• From this we can recover sin θ and cos θ, without having to compute θ itself. We can,
for example, construct a right-angle triangle with adjacent length ∛2 and opposite
length ∛3 (so that tan θ = ∛(3/2)):

[Figure: right-angle triangle with adjacent side 2^{1/3}, opposite side 3^{1/3} and hypotenuse √(2^{2/3} + 3^{2/3})]

$$\sin\theta = \frac{3^{1/3}}{\sqrt{2^{2/3} + 3^{2/3}}} \qquad \cos\theta = \frac{2^{1/3}}{\sqrt{2^{2/3} + 3^{2/3}}}$$

Alternatively we could use the identities:

$$1 + \tan^2\theta = \sec^2\theta \qquad 1 + \cot^2\theta = \csc^2\theta$$

to obtain expressions for 1/cos θ and 1/sin θ.
• Using the above expressions for sin θ, cos θ we find the minimum of ℓ (which is the
longest rod that we can move):

$$\ell = \frac{3}{\sin\theta} + \frac{2}{\cos\theta} = \frac{3\sqrt{2^{2/3} + 3^{2/3}}}{3^{1/3}} + \frac{2\sqrt{2^{2/3} + 3^{2/3}}}{2^{1/3}}$$
$$= \sqrt{2^{2/3} + 3^{2/3}}\,\left[3^{2/3} + 2^{2/3}\right]$$
$$= \left[2^{2/3} + 3^{2/3}\right]^{3/2} \approx 7.02\text{ m}$$
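
The closed form can be compared with a direct evaluation of ℓ(θ) at the critical angle. The Python sketch below assumes ℓ(θ) = 3/sin θ + 2/cos θ and tan θ = ∛(3/2) as derived above; the names are illustrative.

```python
import math

def rod_length(theta):
    """l(theta) = 3/sin(theta) + 2/cos(theta), for 0 < theta < pi/2."""
    return 3 / math.sin(theta) + 2 / math.cos(theta)

theta_star = math.atan((3 / 2) ** (1 / 3))
closed_form = (2 ** (2 / 3) + 3 ** (2 / 3)) ** 1.5
print(theta_star, rod_length(theta_star), closed_form)
```

Both printed lengths should agree, and nudging `theta_star` in either direction makes `rod_length` larger, as expected at a minimum.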




3.6 4 Sketching graphs 


One of the most obvious applications of derivatives is to help us understand the shape 
of the graph of a function. In this section we will use our accumulated knowledge of 
derivatives to identify the most important qualitative features of graphs y = f(x). The 
goal of this section is to highlight features of the graph y = f(x) that are easily


e determined from f(x) itself, and 
e deduced from f’(x), and 
e read from f” (x). 


We will then use the ideas to sketch several examples. 


3.6.1 » Domain, intercepts and asymptotes 


Given a function f(x), there are several important features that we can determine from 
that expression before examining its derivatives. 


• The domain of the function: take note of values where f does not exist. If the
function is rational, look for where the denominator is zero. Similarly be careful to
look for roots of negative numbers or other possible sources of discontinuities.


• Intercepts: examine where the function crosses the x-axis and the y-axis by solving
f(x) = 0 and computing f(0).


• Vertical asymptotes: look for values of x at which f(x) blows up. If f(x)
approaches either +∞ or −∞ as x approaches a (or possibly as x approaches a from
one side) then x = a is a vertical asymptote to y = f(x). When f(x) is a rational
function (written so that common factors are cancelled), then y = f(x) has vertical
asymptotes at the zeroes of the denominator.


• Horizontal asymptotes: examine the limits of f(x) as x → +∞ and x → −∞. Often
f(x) will tend to +∞ or to −∞ or to a finite limit L. If, for example, lim_{x→+∞} f(x) = L,
then y = L is a horizontal asymptote to y = f(x) as x → +∞.

Example 3.6.1


Consider the function

$$f(x) = \frac{x + 1}{(x + 3)(x - 2)}$$

• We see that it is defined on all real numbers except x = −3, +2.


• Since f(0) = −1/6 and f(x) = 0 only when x = −1, the graph has y-intercept
(0, −1/6) and x-intercept (−1, 0).


• Since the function is rational and its denominator is zero at x = −3, +2 it will have
vertical asymptotes at x = −3, +2. To determine the shape around those asymptotes
we need to examine the limits

$$\lim_{x\to -3} f(x) \qquad \lim_{x\to 2} f(x)$$

Notice that when x is close to −3, the factors (x + 1) and (x − 2) are both negative,
so the sign of f(x) = (x + 1)/(x − 2) · 1/(x + 3) is the same as the sign of x + 3. Hence

$$\lim_{x\to -3^+} f(x) = +\infty \qquad \lim_{x\to -3^-} f(x) = -\infty$$

A similar analysis when x is near 2 gives

$$\lim_{x\to 2^+} f(x) = +\infty \qquad \lim_{x\to 2^-} f(x) = -\infty$$


• Finally since the numerator has degree 1 and the denominator has degree 2, we see
that as x → ±∞, f(x) → 0. So y = 0 is a horizontal asymptote.


• Since we know the behaviour around the asymptotes and we know the locations of
the intercepts (as shown in the left graph below), we can then join up the pieces and
smooth them out to get a good sketch of this function (below right).
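
The one-sided behaviour near each asymptote can be probed numerically by evaluating f just to the left and right of x = −3 and x = 2. The Python sketch below is only a sanity check of the signs; the offset `eps` is an arbitrary small number chosen for illustration.

```python
def f(x):
    """f(x) = (x + 1) / ((x + 3)(x - 2))."""
    return (x + 1) / ((x + 3) * (x - 2))

eps = 1e-6
for a in (-3.0, 2.0):
    # Large negative value on the left, large positive value on the right.
    print(a, f(a - eps), f(a + eps))

# Far from the origin the function is close to the horizontal asymptote y = 0.
print(f(1e9), f(-1e9))
```

A numerical probe like this cannot prove a limit, but it quickly catches a sign error in the hand analysis.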




3.6.2 » First derivative — increasing or decreasing 


Now we move on to the first derivative, f’(x). This is a good time to revisit the mean-value 
theorem (Theorem 2.13.4) and some of its consequences (Corollary 2.13.11). In particular, 
let us assume that f(x) is continuous on an interval [A,B] and differentiable on (A,B). 
Then 


• if f’(x) > 0 for all A < x < B, then f(x) is increasing on (A, B)
  (that is, for all A < a < b < B, f(a) < f(b)).


• if f’(x) < 0 for all A < x < B, then f(x) is decreasing on (A, B)
  (that is, for all A < a < b < B, f(a) > f(b)).


Thus the sign of the derivative indicates to us whether the function is increasing or de- 
creasing. Further, as we discussed in Section 3.5.1, we should also examine points at 
which the derivative is zero — critical points — and where the derivative does not exist 
— singular points. These points may indicate a local maximum or minimum. 

After studying the function f(x) as described above, we should compute its derivative
$f'(x)$.


• Critical points — determine where $f'(x) = 0$. At a critical point, f has a horizontal
tangent.


• Singular points — determine where $f'(x)$ is not defined. If $f'(x)$ approaches $\pm\infty$ as x
approaches a singular point a, then f has a vertical tangent there when f approaches
a finite value as x approaches a (or possibly approaches a from one side) and a vertical
asymptote when f(x) approaches $\pm\infty$ as x approaches a (or possibly approaches
a from one side).


e Increasing and decreasing — where is the derivative positive and where is it neg- 
ative. Notice that in order for the derivative to change sign, it must either pass 
through zero (a critical point) or have a singular point. Thus neighbouring regions 
of increase and decrease will be separated by critical and singular points. 
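
The last observation suggests a simple mechanical procedure: sort the critical and singular points, pick one test point in each of the intervals they cut out, and record the sign of the derivative there. Here is a minimal Python sketch of that idea. The function name `sign_chart`, the `margin` parameter and the sample derivative are our own illustrative choices.

```python
def sign_chart(derivative, break_points, margin=1.0):
    """Return ((left, right), sign) pairs for the intervals between break points.

    `break_points` are the critical and singular points of f, in any order.
    One test point is sampled inside each interval, which is enough because
    the derivative cannot change sign strictly between break points.
    """
    pts = sorted(break_points)
    # One test point left of the first point, the midpoints, one right of the last.
    tests = [pts[0] - margin]
    tests += [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    tests.append(pts[-1] + margin)
    edges = [float("-inf")] + pts + [float("inf")]
    chart = []
    for left, right, t in zip(edges, edges[1:], tests):
        value = derivative(t)
        sign = "+" if value > 0 else "-" if value < 0 else "0"
        chart.append(((left, right), sign))
    return chart

# Illustration with f'(x) = 3x^2 - 3, whose zeroes are x = -1 and x = 1.
chart = sign_chart(lambda x: 3 * x**2 - 3, [-1, 1])
for interval, sign in chart:
    print(interval, sign)
```

A "+" marks an interval of increase and a "-" an interval of decrease.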


Example 3.6.2


Consider the function 


\[ f(x) = x^4 - 6x^3 \]


e Before we move on to derivatives, let us first examine the function itself as we did 
above. 


- As f(x) is a polynomial its domain is all real numbers.


- Its y-intercept is at (0,0). We find its x-intercepts by factoring
\[ f(x) = x^4 - 6x^3 = x^3 (x - 6) \]
So it crosses the x-axis at x = 0, 6.

- Again, since the function is a polynomial it does not have any vertical asymptotes. And since

\[ \lim_{x \to \pm\infty} f(x) = \lim_{x \to \pm\infty} x^4 (1 - 6/x) = +\infty \]

it does not have horizontal asymptotes — it blows up to $+\infty$ as x goes to $\pm\infty$.


— We can also determine where the function is positive or negative since we know 
it is continuous everywhere and zero at x = 0,6. Thus we must examine the 
intervals 


\[ (-\infty, 0) \qquad (0, 6) \qquad (6, \infty) \]


When x < 0, $x^3 < 0$ and $x - 6 < 0$ so $f(x) = x^3 (x - 6) = (\text{negative})(\text{negative}) > 0$.
Similarly when x > 6, $x^3 > 0$, $x - 6 > 0$ we must have f(x) > 0. Finally when
0 < x < 6, $x^3 > 0$ but $x - 6 < 0$ so f(x) < 0. Thus


interval | $(-\infty,0)$ | 0 | $(0,6)$ | 6 | $(6,\infty)$
$f(x)$ | positive | 0 | negative | 0 | positive


— Based on this information we can already construct a rough sketch. 


[Figure: rough sketch of y = f(x), marked positive for x < 0, negative for 0 < x < 6 and positive for x > 6.]


e Now we compute its derivative 


\[ f'(x) = 4x^3 - 18x^2 = 2x^2 (2x - 9) \]


e Since the function is a polynomial, it does not have any singular points, but it does 
have two critical points at x = 0,9/2. These two critical points split the real line into 
3 open intervals 


\[ (-\infty, 0) \qquad (0, 9/2) \qquad (9/2, \infty) \]
We need to determine the sign of the derivative in each interval.


- When x < 0, $x^2 > 0$ but $(2x - 9) < 0$, so $f'(x) < 0$ and the function is decreasing.


- When 0 < x < 9/2, $x^2 > 0$ but $(2x - 9) < 0$, so $f'(x) < 0$ and the function is still
decreasing.


- When x > 9/2, $x^2 > 0$ and $(2x - 9) > 0$, so $f'(x) > 0$ and the function is
increasing.

We can then summarise this in the following table


interval | $(-\infty,0)$ | 0 | $(0,9/2)$ | 9/2 | $(9/2,\infty)$
$f'(x)$ | negative | 0 | negative | 0 | positive
 | decreasing | horizontal tangent | decreasing | minimum | increasing


Since the derivative changes sign from negative to positive at the critical point x = 
9/2, this point is a minimum. Its y-value is 


\[ y = f(9/2) = \frac{9^3}{2^3} \left( \frac{9}{2} - 6 \right) = \frac{3^6}{2^3} \cdot \left( -\frac{3}{2} \right) = -\frac{3^7}{2^4} \]


On the other hand, at x = 0 the derivative does not change sign; while this point has 
a horizontal tangent line it is not a minimum or maximum. 


e Putting this information together we arrive at a quite reasonable sketch. 


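[Figure: sketch of $y = x^4 - 6x^3$, decreasing for x < 0 and for 0 < x < 9/2, increasing for x > 9/2, with the minimum at $(9/2, -3^7/2^4)$ marked.]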


To improve upon this further we will examine the second derivative. 
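
The minimum value found above can be double-checked with exact rational arithmetic. This is a small Python sketch using the standard `fractions` module; the names `f`, `f_prime`, `x_min` and `y_min` are our own.

```python
from fractions import Fraction

def f(x):
    return x**4 - 6 * x**3

def f_prime(x):
    return 4 * x**3 - 18 * x**2

x_min = Fraction(9, 2)
y_min = f(x_min)          # exact value of f(9/2)
print(y_min)

# Sign of the derivative on either side of each critical point.
for x in (-1, 1, 4, 5):
    print(x, f_prime(x))
```

The derivative is negative at x = −1, 1 and 4 and positive at x = 5, so it changes sign only at x = 9/2, in agreement with the table.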




3.6.3 »» Second derivative — concavity 


The second derivative f” (x) tells us the rate at which the derivative changes. Perhaps the 
easiest way to understand how to interpret the sign of the second derivative is to think 
about what it implies about the slope of the tangent line to the graph of the function. 
Consider the following sketches of $y = 1 + x^2$ and $y = -1 - x^2$.

• In the case of $y = f(x) = 1 + x^2$, $f''(x) = 2 > 0$. Notice that this means the slope,
$f'(x)$, of the line tangent to the graph at x increases as x increases. Looking at the
figure on the left above, we see that the graph always lies above the tangent lines.


• For $y = f(x) = -1 - x^2$, $f''(x) = -2 < 0$. The slope, $f'(x)$, of the line tangent to the
graph at x decreases as x increases. Looking at the figure on the right above, we see
that the graph always lies below the tangent lines.


Similarly consider the following sketches of $y = x^{-1/2}$ and $y = \sqrt{4 - x}$:


Both of their derivatives, $-\frac{1}{2} x^{-3/2}$ and $-\frac{1}{2} (4 - x)^{-1/2}$, are negative, so they are decreasing
functions. Examining second derivatives shows some differences.


• For the first function, $y''(x) = \frac{3}{4} x^{-5/2} > 0$, so the slopes of tangent lines are increasing
with x and the graph lies above its tangent lines.


• However, the second function has $y''(x) = -\frac{1}{4} (4 - x)^{-3/2} < 0$ so the slopes of the
tangent lines are decreasing with x and the graph lies below its tangent lines.


More generally

Let f(x) be a continuous function on the interval [a,b] and suppose its first and
second derivatives exist on that interval.


• If $f''(x) > 0$ for all a < x < b, then the graph of f lies above its tangent lines
for a < x < b and it is said to be concave up.


• If $f''(x) < 0$ for all a < x < b, then the graph of f lies below its tangent lines
for a < x < b and it is said to be concave down.


• If $f''(c) = 0$ for some a < c < b, and the concavity of f changes across x = c,
then we call (c, f(c)) an inflection point.


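[Figure: a curve that is concave down on one side of the point (c, f(c)) and concave up on the other; the point (c, f(c)) is labelled as the inflection point.]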


Note that one might also see the terms 

e “convex” or “convex up” used in place of “concave up”, and 

e “concave” or “convex down” used to mean “concave down”. 
To avoid confusion we recommend the reader stick with the terms “concave up” and 
“concave down”. 
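
The geometric meaning of concave up (the graph lies above its tangent lines) is easy to test numerically. The following Python sketch does so for $y = 1 + x^2$ on a grid of sample points; the helper `tangent_line` and the grid are our own choices, and a finite grid illustrates the statement rather than proving it.

```python
def tangent_line(f, fprime, a):
    """Return the function x -> f(a) + f'(a) (x - a)."""
    return lambda x: f(a) + fprime(a) * (x - a)

f = lambda x: 1 + x**2
fprime = lambda x: 2 * x

xs = [k / 4 for k in range(-12, 13)]   # sample points in [-3, 3]

# For every point of tangency a, compare the graph with the tangent line at every x.
above = all(
    f(x) >= tangent_line(f, fprime, a)(x)
    for a in xs
    for x in xs
)
print(above)
```

Here the difference between the graph and the tangent line at a is exactly $(x - a)^2$, which is never negative.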


Let’s now continue Example 3.6.2 by discussing the concavity of the curve.

Example 3.6.4 (Continuation of Example 3.6.2)


Consider again the function
\[ f(x) = x^4 - 6x^3 \]
• Its first derivative is $f'(x) = 4x^3 - 18x^2$, so


\[ f''(x) = 12x^2 - 36x = 12x (x - 3) \]


e Thus the second derivative is zero (and potentially changes sign) at x = 0,3. Thus 
we should consider the sign of the second derivative on the following intervals 


\[ (-\infty, 0) \qquad (0, 3) \qquad (3, \infty) \]
A little algebra gives us

interval | $(-\infty,0)$ | 0 | $(0,3)$ | 3 | $(3,\infty)$
$f''(x)$ | positive | 0 | negative | 0 | positive
concavity | up | inflection | down | inflection | up


Since the concavity changes at both x = 0 and x = 3, the following are inflection
points


\[ (0,0) \qquad (3, 3^4 - 6 \times 3^3) = (3, -3^4) \]


e Putting this together with the information we obtained earlier gives us the following 
sketch 


[Figure: sketch of $y = x^4 - 6x^3$, concave up for x < 0, concave down for 0 < x < 3 and concave up for x > 3.]

3.6.4 » Symmetries


Before we proceed to some examples, we should examine some simple symmetries pos- 
sessed by some functions. We’ll look at three symmetries — evenness, oddness and peri- 
odicity. If a function possesses one of these symmetries then it can be exploited to reduce 
the amount of work required to sketch the graph of the function. 

Let us start with even and odd functions. 


A function f(x) is said to be even if f(—x) = f(x) for all x. 


A function f(x) is said to be odd if f(—x) = —f(x) for all x. 


Example 3.6.7
Let $f(x) = x^2$ and $g(x) = x^3$. Then

\[ f(-x) = (-x)^2 = x^2 = f(x) \]
\[ g(-x) = (-x)^3 = -x^3 = -g(x) \]

Hence f(x) is even and g(x) is odd.
Notice any polynomial involving only even powers of x will be even
\begin{align*}
f(x) &= 7x^6 + 2x^4 - 3x^2 + 5 && \text{remember that } 5 = 5x^0 \\
f(-x) &= 7(-x)^6 + 2(-x)^4 - 3(-x)^2 + 5 \\
&= 7x^6 + 2x^4 - 3x^2 + 5 = f(x)
\end{align*}


Similarly any polynomial involving only odd powers of x will be odd


\begin{align*}
g(x) &= 2x^5 - 8x^3 - 3x \\
g(-x) &= 2(-x)^5 - 8(-x)^3 - 3(-x) \\
&= -2x^5 + 8x^3 + 3x = -g(x)
\end{align*}


Not all even and odd functions are polynomials. For example 


\[ |x| \qquad \cos x \qquad \text{and} \qquad (e^x + e^{-x}) \]
are all even, while


\[ \sin x \qquad \tan x \qquad \text{and} \qquad (e^x - e^{-x}) \]
are all odd. Indeed, given any function f(x), the function


g(x) = f(x) + f(—x) will be even, and 
h(x) = f(x) — f(—x) will be odd. 
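
These two constructions are easy to try out. The sketch below (in Python, with helper names `even_combination` and `odd_combination` that are our own) builds g and h from $f(x) = e^x$ and checks the symmetry at a few sample points.

```python
import math

def even_combination(f):
    """g(x) = f(x) + f(-x), which is even for any f."""
    return lambda x: f(x) + f(-x)

def odd_combination(f):
    """h(x) = f(x) - f(-x), which is odd for any f."""
    return lambda x: f(x) - f(-x)

g = even_combination(math.exp)   # e^x + e^(-x)
h = odd_combination(math.exp)    # e^x - e^(-x)

for x in (0.5, 1.0, 2.0):
    print(x, g(-x) == g(x), h(-x) == -h(x))
```

With $f(x) = e^x$ these are exactly the even and odd examples $(e^x + e^{-x})$ and $(e^x - e^{-x})$ listed above.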


Now let us see how we can make use of these symmetries to make graph sketching 
easier. Let f(x) be an even function. Then 


the point $(x_0, y_0)$ lies on the graph of y = f(x)


if and only if $y_0 = f(x_0) = f(-x_0)$ which is the case if and only if


the point $(-x_0, y_0)$ lies on the graph of y = f(x).


Notice that the points $(x_0, y_0)$ and $(-x_0, y_0)$ are just reflections of each other across the
y-axis. Consequently, to draw the graph y = f(x), it suffices to draw the part of the graph
with x > 0 and then reflect it in the y-axis. Here is an example. The part with x > 0 is on
the left and the full graph is on the right.


Very similarly, when f(x) is an odd function then 
$(x_0, y_0)$ lies on the graph of y = f(x)
if and only if


$(-x_0, -y_0)$ lies on the graph of y = f(x)

Now the symmetry is a little harder to interpret pictorially. To get from $(x_0, y_0)$ to $(-x_0, -y_0)$
one can first reflect $(x_0, y_0)$ in the y-axis to get to $(-x_0, y_0)$ and then reflect the result in
the x-axis to get to $(-x_0, -y_0)$. Consequently, to draw the graph y = f(x), it suffices to
draw the part of the graph with x > 0 and then reflect it first in the y-axis and then in the
x-axis. Here is an example. First, here is the part of the graph with x > 0.


Next, as an intermediate step (usually done in our heads rather than on paper), we add in 
the reflection in the y-axis. 


Finally to get the full graph, we reflect the dashed line in the x-axis
and then remove the dashed line.


Let’s do a more substantial example of an even function 


Example 3.6.8


Consider the function

\[ g(x) = \frac{x^2 - 9}{x^2 + 3} \]

• The function is even since


\[ g(-x) = \frac{(-x)^2 - 9}{(-x)^2 + 3} = \frac{x^2 - 9}{x^2 + 3} = g(x) \]


Thus it suffices to study the function for x > 0 because we can then use the even 
symmetry to understand what happens for x < 0. 


• The function is defined on all real numbers since its denominator $x^2 + 3$ is never
zero. Hence it has no vertical asymptotes.


• The y-intercept is $g(0) = \frac{-9}{3} = -3$. And x-intercepts are given by the solution
of $x^2 - 9 = 0$, namely $x = \pm 3$. Note that we only need to establish x = 3 as an
intercept. Then since g is even, we know that x = −3 is also an intercept.

• To find the horizontal asymptotes we compute the limit as $x \to +\infty$


\begin{align*}
\lim_{x \to \infty} g(x) &= \lim_{x \to \infty} \frac{x^2 - 9}{x^2 + 3} \\
&= \lim_{x \to \infty} \frac{x^2 (1 - 9/x^2)}{x^2 (1 + 3/x^2)} \\
&= \lim_{x \to \infty} \frac{1 - 9/x^2}{1 + 3/x^2} = 1
\end{align*}
Thus y = 1 is a horizontal asymptote. Indeed, this is also the asymptote as $x \to -\infty$
since by the even symmetry
\[ \lim_{x \to -\infty} g(x) = \lim_{x \to \infty} g(-x) = \lim_{x \to \infty} g(x). \]
e We can already produce a quite reasonable sketch just by putting in the horizontal 
asymptote and the intercepts and drawing a smooth curve between them. 


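[Figure: preliminary sketch of y = g(x) showing the intercepts, the horizontal asymptote y = 1 and the even symmetry.]


Note that we have drawn the function as never crossing the asymptote y = 1, however
we have not yet proved that. We could do so by trying to solve g(x) = 1.


\begin{align*}
\frac{x^2 - 9}{x^2 + 3} &= 1 \\
x^2 - 9 &= x^2 + 3 \\
-9 &= 3 && \text{so no solutions.}
\end{align*}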


Alternatively we could analyse the first derivative to see how the function approaches 
the asymptote. 


• Now we turn to the first derivative:
\begin{align*}
g'(x) &= \frac{(x^2 + 3)(2x) - (x^2 - 9)(2x)}{(x^2 + 3)^2} \\
&= \frac{24x}{(x^2 + 3)^2}
\end{align*}
There are no singular points since the denominator is nowhere zero. The only critical
point is at x = 0. Thus we must find the sign of $g'(x)$ on the intervals


\[ (-\infty, 0) \qquad (0, \infty) \]

• When x > 0, 24x > 0 and $(x^2 + 3) > 0$, so $g'(x) > 0$ and the function is increasing. By
even symmetry we know that when x < 0 the function must be decreasing. Hence
the critical point x = 0 is a local minimum of the function.


• Notice that since the function is increasing for x > 0, it must approach
the horizontal asymptote y = 1 from below. Thus the sketch above is quite accurate.


• Now consider the second derivative:


\begin{align*}
g''(x) &= \frac{d}{dx} \frac{24x}{(x^2 + 3)^2} \\
&= \frac{(x^2 + 3)^2 \cdot 24 - 24x \cdot 2 (x^2 + 3) \cdot 2x}{(x^2 + 3)^4} && \text{cancel a factor of } (x^2 + 3) \\
&= \frac{(x^2 + 3) \cdot 24 - 96x^2}{(x^2 + 3)^3} \\
&= \frac{72 (1 - x^2)}{(x^2 + 3)^3}
\end{align*}


• It is clear that $g''(x) = 0$ when $x = \pm 1$. Note that, again, we can infer the zero at
x = −1 from the zero at x = 1 by the even symmetry. Thus we need to examine the
sign of $g''(x)$ on the intervals


\[ (-\infty, -1) \qquad (-1, 1) \qquad (1, \infty) \]


• When |x| < 1 we have $(1 - x^2) > 0$ so that $g''(x) > 0$ and the function is concave
up. When |x| > 1 we have $(1 - x^2) < 0$ so that $g''(x) < 0$ and the function is
concave down. Thus the points $x = \pm 1$ are inflection points. Their coordinates are


\[ (\pm 1, g(\pm 1)) = (\pm 1, -2). \]


e Putting this together gives the following sketch: 


[Figure: sketch of y = g(x), concave down for x < −1, concave up for −1 < x < 1 and concave down for x > 1.]
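
The formula for the second derivative in the example above involves a fair amount of algebra, so a numerical cross-check is reassuring. The Python sketch below compares the closed form with a central-difference estimate; the helper names and the step size `h` are our own choices, and the finite difference is only an approximation.

```python
def g(x):
    return (x**2 - 9) / (x**2 + 3)

def g2(x):
    """The second derivative 72 (1 - x^2) / (x^2 + 3)^3 found in the example."""
    return 72 * (1 - x**2) / (x**2 + 3) ** 3

def second_difference(func, x, h=1e-3):
    """Central-difference estimate of the second derivative of func at x."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2

for x in (-2.0, 0.0, 2.0):
    print(x, g2(x), second_difference(g, x))
```

The two columns agree to several decimal places, with negative values at x = ±2 (concave down) and a positive value at x = 0 (concave up).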


Another symmetry we should consider is periodicity.

A function f(x) is said to be periodic, with period P > 0, if f(x + P) = f(x) for
all x.


Note that if f(x + P) = f(x) for all x, then replacing x by x + P, we have
\[ f(x + 2P) = f(x + P + P) = f(x + P) = f(x) \]


More generally f(x + kP) = f(x) for all integers k. Thus if f has period P, then it also
has period nP for all natural numbers n. The smallest period is called the fundamental
period.


Example 3.6.10


The classic example of a periodic function is $f(x) = \sin x$, which has period $2\pi$ since
$f(x + 2\pi) = \sin(x + 2\pi) = \sin x = f(x)$.


If f(x) has period P then 


$(x_0, y_0)$ lies on the graph of y = f(x)
if and only if $y_0 = f(x_0) = f(x_0 + P)$ which is the case if and only if
$(x_0 + P, y_0)$ lies on the graph of y = f(x)
and, more generally,
$(x_0, y_0)$ lies on the graph of y = f(x)
if and only if
$(x_0 + nP, y_0)$ lies on the graph of y = f(x)


for all integers n.

Note that the point $(x_0 + P, y_0)$ can be obtained by translating $(x_0, y_0)$ horizontally
by P. Similarly the point $(x_0 + nP, y_0)$ can be found by repeatedly translating $(x_0, y_0)$
horizontally by P.
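
The same idea lets us rebuild a periodic function from a single period. In the Python sketch below (the helper name `periodic_extension` is our own) any x is first reduced to the interval [0, P), and then the one-period description is used.

```python
import math

def periodic_extension(one_period, P):
    """Extend a function known on [0, P) to all real x using f(x + P) = f(x)."""
    return lambda x: one_period(x % P)

P = 2 * math.pi
f = periodic_extension(math.sin, P)   # sine, rebuilt from a single period

x0 = 1.0
for n in (-2, -1, 0, 1, 2):
    # Each translate x0 + nP should give the same height y0 = sin(1).
    print(n, f(x0 + n * P))
```

In Python the expression `x % P` with P > 0 always lies in [0, P), even for negative x, which is what makes the reduction work.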


Consequently, to draw the graph y = f(x), it suffices to draw one period of the graph, say
the part with 0 < x < P, and then translate it repeatedly. Here is an example. Here is a
sketch of one period and here is the full sketch.

[Figure: one period of a periodic graph, and the full graph obtained by translating it, with the point $(x_0 + P, y_0)$ marked.]


3.6.5 » A checklist for sketching 


Above we have described how we can use our accumulated knowledge of derivatives to 
quickly identify the most important qualitative features of graphs y = f(x). Here we give 
the reader a quick checklist of things to examine in order to produce an accurate sketch 
based on properties that are easily read off from f(x), f’(x) and f” (x). 


>>> A sketching checklist. 
(1) Features of y = f(x) that are read off of f(x): 


e First check where f(x) is defined. Then 

e y=f(x) is plotted only for x’s in the domain of f(x), i.e. where f(x) is defined. 
• y = f(x) has vertical asymptotes at the points where f(x) blows up to $\pm\infty$.

e Next determine whether the function is even, odd, or periodic. 


e y = f(x) is first plotted for x > 0 if the function is even or odd. The rest of the 
sketch is then created by reflections. 


e y = f(x) is first plotted for a single period if the function is periodic. The rest of 
the sketch is then created by translations. 


• Next compute f(0), $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$ and look for solutions to
f(x) = 0 that you can easily find. Then
• y = f(x) has y-intercept (0, f(0)).
• y = f(x) has x-intercept (a, 0) whenever f(a) = 0.
• y = f(x) has horizontal asymptote y = L if $\lim_{x \to \infty} f(x) = L$ or $\lim_{x \to -\infty} f(x) = L$.

(2) Features of y = f(x) that are read off of $f'(x)$:


• Compute $f'(x)$ and determine its critical points and singular points, then

• y = f(x) has a horizontal tangent at the points where $f'(x) = 0$.

• y = f(x) is increasing at points where $f'(x) > 0$.

• y = f(x) is decreasing at points where $f'(x) < 0$.

• y = f(x) has vertical tangents or vertical asymptotes at the points where $f'(x) = \pm\infty$.


(3) Features of y = f(x) that are read off of f” (x): 


e Compute f”(x) and determine where f”(x) = 0 or does not exist, then 
e y = f(x) is concave up at points where f”(x) > 0. 
e y = f(x) is concave down at points where f”(x) < 0. 


e y = f(x) may or may not have inflection points where f”(x) = 0. 
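
For polynomials, parts (2) and (3) of the checklist can be carried out mechanically. The following Python sketch is one possible way to do this and not a general graphing tool: the function names and the convention of listing coefficients from the highest power down are our own choices.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest degree down."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def poly_derivative(coeffs):
    """Coefficients of the derivative, again from highest degree down."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

def describe(coeffs, x):
    """Report what f, f' and f'' say about the graph at the point x."""
    d1 = poly_derivative(coeffs)
    d2 = poly_derivative(d1)
    slope = poly_eval(d1, x)
    bend = poly_eval(d2, x)
    return {
        "f": poly_eval(coeffs, x),
        "trend": "increasing" if slope > 0
                 else "decreasing" if slope < 0 else "horizontal tangent",
        "concavity": "up" if bend > 0 else "down" if bend < 0 else "f'' = 0",
    }

cubic = [1, 0, -3, 1]  # the polynomial x^3 - 3x + 1
for x in (-2, -1, 0, 1, 2):
    print(x, describe(cubic, x))
```

Note that the report "f'' = 0" at a point only flags a candidate inflection point; as the checklist says, one still has to check that the concavity actually changes there.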


3.6.6 » Sketching examples 


Example 3.6.11 (Sketch $f(x) = x^3 - 3x + 1$)


(1) Reading from f (x): 


e The function is a polynomial so it is defined everywhere. 


• Since $f(-x) = -x^3 + 3x + 1 \ne \pm f(x)$, it is not even or odd. Nor is it periodic.


• The y-intercept is y = 1. The x-intercepts are not easily computed since it is
a cubic polynomial that does not factor nicely⁵⁷. So for this example we don’t
worry about finding them.


e Since it is a polynomial it has no vertical asymptotes. 


• For very large x, both positive and negative, the $x^3$ term in f(x) dominates the
other two terms so that


\[ f(x) \to \begin{cases} +\infty & \text{as } x \to +\infty \\ -\infty & \text{as } x \to -\infty \end{cases} \]


and there are no horizontal asymptotes.


(2) We now compute the derivative: 


\[ f'(x) = 3x^2 - 3 = 3 (x^2 - 1) = 3 (x + 1)(x - 1) \]


57 With the aid of a computer we can find the x-intercepts numerically: $x \approx -1.879385242$, $0.3472963553$,
and $1.532088886$. If you are interested in more details then try googling “Newton’s method” or “root
finding”.

• The critical points (where $f'(x) = 0$) are at $x = \pm 1$. Further since the derivative
is a polynomial it is defined everywhere and there are no singular points. The
critical points split the real line into the intervals $(-\infty, -1)$, $(-1, 1)$ and $(1, \infty)$.


• When x < −1, both factors (x + 1), (x − 1) < 0 so $f'(x) > 0$.
• Similarly when x > 1, both factors (x + 1), (x − 1) > 0 so $f'(x) > 0$.
• When −1 < x < 1, (x − 1) < 0 but (x + 1) > 0 so $f'(x) < 0$.


• Summarising all this


interval | $(-\infty,-1)$ | −1 | $(-1,1)$ | 1 | $(1,\infty)$
$f'(x)$ | positive | 0 | negative | 0 | positive
 | increasing | maximum | decreasing | minimum | increasing


So (−1, f(−1)) = (−1, 3) is a local maximum and (1, f(1)) = (1, −1) is a local
minimum.


(3) Compute the second derivative: 
f"(x) = 6x 


• The second derivative is zero when x = 0, and the problem is quite easy to analyse.
Clearly, $f''(x) < 0$ when x < 0 and $f''(x) > 0$ when x > 0.


e Thus f is concave down for x < 0, concave up for x > 0 and has an inflection 
point at x = 0. 
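
The x-intercepts of $y = x^3 - 3x + 1$, which we chose not to find by hand, can be approximated with Newton's method. What follows is only a minimal Python sketch: the function `newton`, its tolerance and the three starting guesses (picked away from the critical points x = ±1, where the derivative vanishes) are our own choices.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x -> x - f(x)/f'(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

f = lambda x: x**3 - 3 * x + 1
fprime = lambda x: 3 * x**2 - 3

# One starting guess in each of the regions x < -1, -1 < x < 1 and x > 1.
roots = [newton(f, fprime, x0) for x0 in (-2.0, 0.0, 2.0)]
print(roots)
```

There is one intercept in each of the three intervals of increase and decrease, consistent with the local maximum lying above the x-axis and the local minimum lying below it.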


Putting this all together gives: 


[Figure: sketch of $y = x^3 - 3x + 1$. Annotations: f′ > 0, f increasing for x < −1; f′ < 0, f decreasing for −1 < x < 1; f′ > 0, f increasing for x > 1; f″ < 0, f concave down for x < 0; f″ > 0, f concave up for x > 0.]

Example 3.6.12 (Sketch $f(x) = x^4 - 4x^3$)


(1) Reading from f(x): 


e The function is a polynomial so it is defined everywhere. 
• Since $f(-x) = x^4 + 4x^3 \ne \pm f(x)$, it is not even or odd. Nor is it periodic.
• The y-intercept is y = f(0) = 0, while the x-intercepts are given by the solution
of
\begin{align*}
f(x) = x^4 - 4x^3 &= 0 \\
x^3 (x - 4) &= 0
\end{align*}
Hence the x-intercepts are 0, 4.


e Since f is a polynomial it does not have any vertical asymptotes. 


• For very large x, both positive and negative, the $x^4$ term in f(x) dominates the
other term so that


\[ f(x) \to \begin{cases} +\infty & \text{as } x \to +\infty \\ +\infty & \text{as } x \to -\infty \end{cases} \]


and the function has no horizontal asymptotes. 
(2) Now compute the derivative f’(x): 
\[ f'(x) = 4x^3 - 12x^2 = 4x^2 (x - 3) \]


• The critical points are at x = 0, 3. Since the function is a polynomial there are no
singular points. The critical points split the real line into the intervals $(-\infty, 0)$,
$(0, 3)$ and $(3, \infty)$.


• When x < 0, $x^2 > 0$ and x − 3 < 0, so $f'(x) < 0$.
• When 0 < x < 3, $x^2 > 0$ and x − 3 < 0, so $f'(x) < 0$.
• When 3 < x, $x^2 > 0$ and x − 3 > 0, so $f'(x) > 0$.


• Summarising all this


interval | $(-\infty,0)$ | 0 | $(0,3)$ | 3 | $(3,\infty)$
$f'(x)$ | negative | 0 | negative | 0 | positive
 | decreasing | horizontal tangent | decreasing | minimum | increasing


So the point (3, f(3)) = (3, —27) is a local minimum. The point (0, f(0)) = (0,0) 
is neither a minimum nor a maximum, even though f’(0) = 0. 


(3) Now examine $f''(x)$:


\[ f''(x) = 12x^2 - 24x = 12x (x - 2) \]

• So $f''(x) = 0$ when x = 0, 2. This splits the real line into the intervals $(-\infty, 0)$, $(0, 2)$
and $(2, \infty)$.

• When x < 0, x − 2 < 0 and so $f''(x) > 0$.

• When 0 < x < 2, x > 0 and x − 2 < 0 and so $f''(x) < 0$.

• When 2 < x, x > 0 and x − 2 > 0 and so $f''(x) > 0$.


• Thus the function is concave up for x < 0, then concave down for 0 < x < 2
and finally concave up again for x > 2. Hence (0, f(0)) = (0, 0) and (2, f(2)) =
(2, −16) are inflection points.


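Putting all this information together gives us the following sketch.

[Figure: sketch of $y = x^4 - 4x^3$ with the local minimum (3, −27) marked. Annotations: f′ < 0, f decreasing for x < 0; f′ < 0, f decreasing for 0 < x < 3; f′ > 0, f increasing for x > 3; f″ > 0, f concave up for x < 0; f″ < 0, f concave down for 0 < x < 2; f″ > 0, f concave up for x > 2.]

Example 3.6.13 ($f(x) = x^3 - 6x^2 + 9x - 54$)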


(1) Reading from f(x): 


e The function is a polynomial so it is defined everywhere. 


• Since $f(-x) = -x^3 - 6x^2 - 9x - 54 \ne \pm f(x)$, it is not even or odd. Nor is it
periodic.


• The y-intercept is y = f(0) = −54, while the x-intercepts are given by the solution
of


\begin{align*}
f(x) = x^3 - 6x^2 + 9x - 54 &= 0 \\
x^2 (x - 6) + 9 (x - 6) &= 0 \\
(x^2 + 9)(x - 6) &= 0
\end{align*}


Hence the only x-intercept is 6.

• Since f is a polynomial it does not have any vertical asymptotes.


• For very large x, both positive and negative, the $x^3$ term in f(x) dominates the
other terms so that


\[ f(x) \to \begin{cases} +\infty & \text{as } x \to +\infty \\ -\infty & \text{as } x \to -\infty \end{cases} \]


and the function has no horizontal asymptotes. 
(2) Now compute the derivative f’(x): 


\begin{align*}
f'(x) &= 3x^2 - 12x + 9 \\
&= 3 (x^2 - 4x + 3) = 3 (x - 3)(x - 1)
\end{align*}
• The critical points are at x = 1, 3. Since the function is a polynomial there are no
singular points. The critical points split the real line into the intervals $(-\infty, 1)$,
$(1, 3)$ and $(3, \infty)$.


• When x < 1, (x − 1) < 0 and (x − 3) < 0, so $f'(x) > 0$.
• When 1 < x < 3, (x − 1) > 0 and (x − 3) < 0, so $f'(x) < 0$.
• When 3 < x, (x − 1) > 0 and (x − 3) > 0, so $f'(x) > 0$.


• Summarising all this


interval | $(-\infty,1)$ | 1 | $(1,3)$ | 3 | $(3,\infty)$
$f'(x)$ | positive | 0 | negative | 0 | positive
 | increasing | maximum | decreasing | minimum | increasing


So the point (1, f(1)) = (1,—50) is a local maximum. The point (3, f(3)) = 
(3, —54) is a local minimum. 


(3) Now examine $f''(x)$:
\[ f''(x) = 6x - 12 \]


• So $f''(x) = 0$ when x = 2. This splits the real line into the intervals $(-\infty, 2)$ and
$(2, \infty)$.

• When x < 2, $f''(x) < 0$.

• When x > 2, $f''(x) > 0$.

• Thus the function is concave down for x < 2, then concave up for x > 2. Hence
(2, f(2)) = (2, −52) is an inflection point.


Putting all this information together gives us the following sketch.

[Figure: sketch of $y = x^3 - 6x^2 + 9x - 54$.]

and if we zoom in around the interesting points (minimum, maximum and inflection
point), we have

[Figure: zoomed sketch with the local minimum (3, −54) marked. Annotations: f′ > 0, increasing for x < 1; f′ < 0, decreasing for 1 < x < 3; f′ > 0, increasing for x > 3; f″ < 0, f concave down for x < 2; f″ > 0, f concave up for x > 2.]


An example of sketching a simple rational function. 


Example 3.6.14 ($f(x) = \frac{x}{x^2 - 4}$)


(1) Reading from f(x): 


e The function is rational so it is defined except where its denominator is zero — 
namely at x = +2. 
• Since $f(-x) = \frac{-x}{x^2 - 4} = -f(x)$, it is odd. Indeed this means that we only need
to examine what happens to the function for x > 0 and we can then infer what
happens for x < 0 using f(−x) = −f(x). In practice we will sketch the graph for
x > 0 and then infer the rest from this symmetry.

e The y-intercept is y = f(0) = 0, while the x-intercepts are given by the solution 
of f(x) = 0. So the only x-intercept is 0. 


e Since f is rational, it may have vertical asymptotes where its denominator is zero 
—at x = +2. Since the function is odd, we only have to analyse the asymptote at 


x = 2 and we can then infer what happens at x = —2 by symmetry. 
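\begin{align*}
\lim_{x \to 2^+} f(x) &= \lim_{x \to 2^+} \frac{x}{(x - 2)(x + 2)} = +\infty \\
\lim_{x \to 2^-} f(x) &= \lim_{x \to 2^-} \frac{x}{(x - 2)(x + 2)} = -\infty
\end{align*}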


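• We now check for horizontal asymptotes:


\begin{align*}
\lim_{x \to +\infty} f(x) &= \lim_{x \to +\infty} \frac{x}{x^2 - 4} \\
&= \lim_{x \to +\infty} \frac{1}{x - 4/x} = 0
\end{align*}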


(2) Now compute the derivative f’(x): 
\begin{align*}
f'(x) &= \frac{(x^2 - 4) \cdot 1 - x \cdot 2x}{(x^2 - 4)^2} \\
&= \frac{-(x^2 + 4)}{(x^2 - 4)^2}
\end{align*}


• Hence there are no critical points. There are singular points where the denominator
is zero, namely $x = \pm 2$. Before we proceed, notice that the numerator is
always negative and the denominator is always positive. Hence $f'(x) < 0$ except
at $x = \pm 2$ where it is undefined.


e The function is decreasing except at x = +2. 


• We already know that at x = 2 we have a vertical asymptote and that $f'(x) < 0$
wherever it is defined. So
\[ \lim_{x \to 2} f'(x) = -\infty \]
• Summarising all this

interval | $[0,2)$ | 2 | $(2,\infty)$
$f'(x)$ | negative | DNE | negative
 | decreasing | vertical asymptote | decreasing


Remember — we will draw the graph for x > 0 and then use the odd symmetry
to infer the graph for x < 0.

(3) Now examine $f''(x)$:
\begin{align*}
f''(x) &= -\frac{(x^2 - 4)^2 \cdot (2x) - (x^2 + 4) \cdot 2 \cdot 2x \cdot (x^2 - 4)}{(x^2 - 4)^4} \\
&= -\frac{(x^2 - 4) \cdot (2x) - (x^2 + 4) \cdot 4x}{(x^2 - 4)^3} \\
&= -\frac{2x^3 - 8x - 4x^3 - 16x}{(x^2 - 4)^3} \\
&= \frac{2x (x^2 + 12)}{(x^2 - 4)^3}
\end{align*}


• So $f''(x) = 0$ when x = 0 and does not exist when $x = \pm 2$. This splits the real
line into the intervals $(-\infty, -2)$, $(-2, 0)$, $(0, 2)$ and $(2, \infty)$. However we only need
to consider x > 0 (because of the odd symmetry).


• When 0 < x < 2, x > 0, $(x^2 + 12) > 0$ and $(x^2 - 4) < 0$ so $f''(x) < 0$.
• When x > 2, x > 0, $(x^2 + 12) > 0$ and $(x^2 - 4) > 0$ so $f''(x) > 0$.


Putting all this information together gives the following sketch for x > 0: 


[Figure: sketch of y = f(x) for x > 0 with the vertical asymptote x = 2. Annotations: f″ < 0, concave down for 0 < x < 2; f″ > 0, concave up for x > 2.]


We can then draw in the graph for x < 0 using f(−x) = −f(x):


[Figure: full sketch of $y = \frac{x}{x^2 - 4}$ with vertical asymptotes at $x = \pm 2$ and the inflection point at the origin marked.]


Notice that this means that the concavity changes at x = 0, so the point (0, f(0)) = (0, 0)
is a point of inflection (as indicated).


This final example is more substantial since the function has singular points (points 
where the derivative is undefined). The analysis is more involved. 


Example 3.6.15 ($f(x) = \sqrt[3]{\frac{x^2}{(x-6)^2}}$)


(1) Reading from f(x): 


e First notice that we can rewrite 


\[ f(x) = \sqrt[3]{\frac{x^2}{(x-6)^2}} = \sqrt[3]{\frac{x^2}{x^2 (1 - 6/x)^2}} = \sqrt[3]{\frac{1}{(1 - 6/x)^2}} \]


e The function is the cube root of a rational function. The rational function is de- 
fined except at x = 6, so the domain of f is all reals except x = 6. 


• Clearly the function is not periodic, and examining


\[ f(-x) = \sqrt[3]{\frac{1}{(1 - 6/(-x))^2}} = \sqrt[3]{\frac{1}{(1 + 6/x)^2}} \ne \pm f(x) \]


shows the function is neither even nor odd.


• To compute horizontal asymptotes we examine the limit of the portion of the function
inside the cube-root

\[ \lim_{x \to \pm\infty} \frac{1}{(1 - 6/x)^2} = 1 \]

This means we have


\[ \lim_{x \to \pm\infty} f(x) = 1 \]


That is, the line y = 1 will be a horizontal asymptote to the graph y = f(x) both
for $x \to +\infty$ and for $x \to -\infty$.


• Our function $f(x) \to +\infty$ as $x \to 6$, because of the $(1 - 6/x)^2$ in its denominator.
So y = f(x) has x = 6 as a vertical asymptote.


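(2) Now compute $f'(x)$. Since we rewrote


\[ f(x) = \sqrt[3]{\frac{1}{(1 - 6/x)^2}} = \left( 1 - \frac{6}{x} \right)^{-2/3} \]
we can use the chain rule


\begin{align*}
f'(x) &= -\frac{2}{3} \left( 1 - \frac{6}{x} \right)^{-5/3} \cdot \frac{6}{x^2} \\
&= -4 \left( \frac{x - 6}{x} \right)^{-5/3} \frac{1}{x^2} \\
&= -4 \left( \frac{1}{x - 6} \right)^{5/3} \frac{1}{x^{1/3}}
\end{align*}


• Notice that the derivative is nowhere equal to zero, so the function has no critical
points. However there are two places the derivative is undefined. The terms


\[ \left( \frac{1}{x - 6} \right)^{5/3} \qquad \frac{1}{x^{1/3}} \]


are undefined at x = 6, 0 respectively. Hence x = 0, 6 are singular points. These
split the real line into the intervals $(-\infty, 0)$, $(0, 6)$ and $(6, \infty)$.


• When x < 0, (x − 6) < 0, we have that $(x - 6)^{-5/3} < 0$ and $x^{-1/3} < 0$ and so
$f'(x) = -4 \cdot (\text{negative}) \cdot (\text{negative}) < 0$.

• When 0 < x < 6, (x − 6) < 0, we have that $(x - 6)^{-5/3} < 0$ and $x^{-1/3} > 0$ and so
$f'(x) = -4 \cdot (\text{negative}) \cdot (\text{positive}) > 0$.

• When x > 6, (x − 6) > 0, we have that $(x - 6)^{-5/3} > 0$ and $x^{-1/3} > 0$ and so
$f'(x) < 0$.


• We should also examine the behaviour of the derivative as $x \to 0$ and $x \to 6$.


\begin{align*}
\lim_{x \to 0^-} f'(x) &= -4 \left( \lim_{x \to 0^-} (x - 6)^{-5/3} \right) \left( \lim_{x \to 0^-} x^{-1/3} \right) = -\infty \\
\lim_{x \to 0^+} f'(x) &= -4 \left( \lim_{x \to 0^+} (x - 6)^{-5/3} \right) \left( \lim_{x \to 0^+} x^{-1/3} \right) = +\infty \\
\lim_{x \to 6^-} f'(x) &= -4 \left( \lim_{x \to 6^-} (x - 6)^{-5/3} \right) \left( \lim_{x \to 6^-} x^{-1/3} \right) = +\infty \\
\lim_{x \to 6^+} f'(x) &= -4 \left( \lim_{x \to 6^+} (x - 6)^{-5/3} \right) \left( \lim_{x \to 6^+} x^{-1/3} \right) = -\infty
\end{align*}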

We already know that x = 6 is a vertical asymptote of the function, so it is not 
surprising that the lines tangent to the graph become vertical as we approach 6. 
The behavior around x = 0 is less standard, since the lines tangent to the graph 
become vertical, but x = 0 is not a vertical asymptote of the function. Indeed the 
function takes a finite value y = f(0) = 0. 


e Summarising all this 


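interval | $(-\infty,0)$ | 0 | $(0,6)$ | 6 | $(6,\infty)$
$f'(x)$ | negative | DNE | positive | DNE | negative
 | decreasing | vertical tangents | increasing | vertical asymptote | decreasing

(3) Now look at $f''(x)$:


\begin{align*}
f''(x) &= -4 \frac{d}{dx} \left[ (x - 6)^{-5/3} x^{-1/3} \right] \\
&= -4 \left[ -\frac{5}{3} (x - 6)^{-8/3} x^{-1/3} - \frac{1}{3} (x - 6)^{-5/3} x^{-4/3} \right] \\
&= \frac{4}{3} (x - 6)^{-8/3} x^{-4/3} \left[ 5x + (x - 6) \right] \\
&= 8 (x - 6)^{-8/3} x^{-4/3} (x - 1)
\end{align*}
Oof!


• Both of the factors $(x - 6)^{-8/3} = \left( \frac{1}{\sqrt[3]{x - 6}} \right)^8$ and $x^{-4/3} = \left( \frac{1}{\sqrt[3]{x}} \right)^4$ are even powers and
so are positive (though possibly infinite). So the sign of $f''(x)$ is the same as the
sign of the factor x − 1. Thus


interval | $(-\infty,1)$ | 1 | $(1,\infty)$
$f''(x)$ | negative | 0 | positive
 | concave down | inflection point | concave up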


Here is a sketch of the graph y = f(x). 


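[Figure: sketch of y = f(x) with the horizontal asymptote y = 1 and the vertical asymptote x = 6. Annotations: f′ < 0, f decreasing for x < 0; f′ > 0 for 0 < x < 6; f′ < 0, f decreasing for x > 6.]


It is hard to see the inflection point at x = 1, $y = f(1) = \frac{1}{\sqrt[3]{25}}$ in the above sketch. So here
is a blow up of the part of the sketch around x = 1.

[Figure: blow up of the sketch near the inflection point $(1, 1/\sqrt[3]{25})$.]


And if we zoom in even more we have


[Figure: further blow up near the inflection point $(1, 1/\sqrt[3]{25})$.]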
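
Because this example involves fractional powers of negative numbers, it is worth checking the signs with a computer. The Python sketch below is a rough check only. The helper `cbrt` (a real cube root that also works for negative inputs) and the sample points are our own choices; the formula for the derivative is the one obtained in the example, $f'(x) = -4 (x-6)^{-5/3} x^{-1/3}$.

```python
import math

def cbrt(t):
    """Real cube root, valid for negative t as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def f(x):
    return cbrt(x**2 / (x - 6) ** 2)

def f_prime(x):
    """-4 (x - 6)^(-5/3) x^(-1/3), written with real cube roots."""
    return -4.0 / (cbrt(x - 6) ** 5 * cbrt(x))

# One sample point in each of the intervals x < 0, 0 < x < 6 and x > 6.
for x in (-1.0, 3.0, 7.0):
    print(x, f(x), f_prime(x))

# Height of the inflection point at x = 1.
print(f(1.0), 25 ** (-1.0 / 3.0))
```

A plain `x ** (1/3)` would not do here, since in Python a negative float raised to a fractional power gives a complex number; this is the reason for the sign-aware helper.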


3.7 4 L’Hôpital’s Rule and indeterminate forms


Let us return to limits (Chapter 1) and see how we can use derivatives to simplify cer- 
tain families of limits called indeterminate forms. We know, from Theorem 1.4.2 on the 
arithmetic of limits, that if 

\[ \lim_{x \to a} f(x) = F \qquad \lim_{x \to a} g(x) = G \]


and $G \ne 0$, then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{F}{G} \]
The requirement that $G \ne 0$ is critical — we explored this in Example 1.4.6. Please reread
that example.
Of course⁵⁸ it is not surprising that if $F \ne 0$ and G = 0, then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \text{DNE} \]


58 Now it is not so surprising, but perhaps back when we started limits, this was not so obvious.

and if F = 0 but $G \ne 0$ then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = 0 \]


However when both F,G = 0 then, as we saw in Example 1.4.6, almost anything can 
happen 


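\begin{align*}
f(x) &= x & g(x) &= x^2 & \lim_{x \to 0} \frac{x}{x^2} &= \lim_{x \to 0} \frac{1}{x} = \text{DNE} \\
f(x) &= x^2 & g(x) &= x & \lim_{x \to 0} \frac{x^2}{x} &= \lim_{x \to 0} x = 0 \\
f(x) &= x & g(x) &= x & \lim_{x \to 0} \frac{x}{x} &= \lim_{x \to 0} 1 = 1 \\
f(x) &= 7x^2 & g(x) &= 3x^2 & \lim_{x \to 0} \frac{7x^2}{3x^2} &= \lim_{x \to 0} \frac{7}{3} = \frac{7}{3}
\end{align*}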

Indeed after exploring Example 1.4.11 and 1.4.13 we gave ourselves the rule of thumb that 
if we found 0/0, then there must be something that cancels. 
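
A quick numerical experiment makes the point that 0/0 by itself tells us nothing. The Python sketch below evaluates four ratios of the kind just discussed at inputs approaching zero; the dictionary `examples` and the sample inputs are our own choices.

```python
examples = {
    "x / x^2": lambda x: x / x**2,
    "x^2 / x": lambda x: x**2 / x,
    "x / x": lambda x: x / x,
    "7x^2 / (3x^2)": lambda x: (7 * x**2) / (3 * x**2),
}

# In every case numerator and denominator both tend to 0 as x -> 0.
for name, ratio in examples.items():
    values = [ratio(x) for x in (0.1, 0.01, 0.001, -0.001)]
    print(name, values)
```

The first ratio grows without bound (and changes sign with x), the second shrinks to 0, the third is constantly 1 and the last stays at 7/3.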

Because the limit that results from these 0/0 situations is not immediately obvious, but 
also leads to some interesting mathematics, we should give it a name. 


Let $a \in \mathbb{R}$ and let f(x) and g(x) be functions. If


\[ \lim_{x \to a} f(x) = 0 \quad \text{and} \quad \lim_{x \to a} g(x) = 0 \]


then the limit

\[ \lim_{x \to a} \frac{f(x)}{g(x)} \]

is called a 0/0 indeterminate form.


There are quite a number of mathematical tools for evaluating such indeterminate
forms — Taylor series for example. A simpler method, which works in quite a few cases,
is L’Hôpital’s rule⁵⁹.


59 Named for the 17th century mathematician, Guillaume de l’Hôpital, who published the first textbook
on differential calculus. The eponymous rule appears in that text, but is believed to have been developed
by Johann Bernoulli. The book was the source of some controversy since it contained many results
by Bernoulli, which l’Hôpital acknowledged in the preface, but Bernoulli felt that l’Hôpital got undue
credit.

Note that around that time l’Hôpital’s name was commonly spelled l’Hospital, but the spelling of
silent s in French was changed subsequently; many texts spell his name l’Hospital. If you find yourself
in Paris, you can hunt along Boulevard de l’Hôpital for older street signs carved into the sides of
buildings which spell it “l’Hospital” — though arguably there are better things to do there.

Theorem 3.7.2 (L’Hôpital’s Rule).


Let $a \in \mathbb{R}$ and assume that


\[ \lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \]


Then
(a) if $f'(a)$ and $g'(a)$ exist and $g'(a) \ne 0$, then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}, \]

(b) while, if $f'(x)$ and $g'(x)$ exist on an open interval that contains a, and if the
limit

\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} \]

exists, then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \]


Proof. We only give the proof for part (a). The proof of part (b) is not very difficult, but 
uses the Generalised Mean—Value Theorem (Theorem 3.4.38), which is optional and most 
readers have not seen it. 


• First note that we must have $f(a) = g(a) = 0$. To see this note that since the derivative
$f'(a)$ exists, we know that the limit

\[ \lim_{x\to a} \frac{f(x) - f(a)}{x - a} \quad\text{exists} \]

Since we know that the denominator goes to zero, we must also have that the
numerator goes to zero (otherwise the limit would be undefined). Hence we must have

\[ \lim_{x\to a} \big(f(x) - f(a)\big) = \Big(\lim_{x\to a} f(x)\Big) - f(a) = 0 \]

We are told that $\lim_{x\to a} f(x) = 0$ so we must have $f(a) = 0$. Similarly we know that
$g(a) = 0$.

• Now consider the indeterminate form

\begin{align*}
\lim_{x\to a} \frac{f(x)}{g(x)}
 &= \lim_{x\to a} \frac{f(x) - f(a)}{g(x) - g(a)} && \text{use } 0 = f(a) = g(a) \\
 &= \lim_{x\to a} \frac{f(x) - f(a)}{g(x) - g(a)} \cdot \frac{(x-a)^{-1}}{(x-a)^{-1}} && \text{multiply by } 1 = \frac{(x-a)^{-1}}{(x-a)^{-1}} \\
 &= \lim_{x\to a} \frac{\;\dfrac{f(x) - f(a)}{x - a}\;}{\dfrac{g(x) - g(a)}{x - a}} && \text{rearrange} \\
 &= \frac{\displaystyle\lim_{x\to a} \frac{f(x) - f(a)}{x - a}}{\displaystyle\lim_{x\to a} \frac{g(x) - g(a)}{x - a}} && \text{use arithmetic of limits} \\
 &= \frac{f'(a)}{g'(a)}
\end{align*}

We can justify this step and apply Theorem 1.4.2, since the limits in the numerator
and denominator exist, because they are just $f'(a)$ and $g'(a)$.
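The chain of equalities in part (a) can be illustrated numerically. The following is a minimal sketch, not part of the proof: the helper names `central_derivative` and `lhopital_part_a` are hypothetical, the derivatives are only approximated by difference quotients, and the test functions $\sin x$ and $\sin 2x$ are borrowed from Example 3.7.4.

```python
import math

def central_derivative(func, a, h=1e-6):
    """Approximate func'(a) with a symmetric difference quotient."""
    return (func(a + h) - func(a - h)) / (2 * h)

def lhopital_part_a(f, g, a):
    """Estimate f'(a)/g'(a), the value that part (a) assigns to a 0/0 form."""
    return central_derivative(f, a) / central_derivative(g, a)

# A 0/0 form at a = 0: sin(x) / sin(2x).
f = math.sin
g = lambda x: math.sin(2 * x)

predicted = lhopital_part_a(f, g, 0.0)   # ratio of (approximate) derivatives at a
direct = f(1e-4) / g(1e-4)               # original ratio evaluated close to a
print(predicted, direct)
```

Both numbers should sit close to $1/2$; the agreement is only approximate because of the finite step size.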


>>> Optional: proof of part (b) of l’Hôpital’s rule


To prove part (b) we must work around the possibility that $f'(a)$ and $g'(a)$ do not exist
or that $f'(x)$ and $g'(x)$ are not continuous at $x = a$. To do this, we make use of the
Generalised Mean-Value Theorem (Theorem 3.4.38) that was used to prove Equation (3.4.33).
We recommend you review the GMVT before proceeding.

For simplicity we consider the limit

\[ \lim_{x\to a^+} \frac{f(x)}{g(x)} \]

By assumption, we know that

\[ \lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = 0 \]

Since $f$ and $g$ are continuous at $x = a$, we have $f(a) = g(a) = 0$. This allows us to write

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} \]

which is the right form for an application of the GMVT.
By assumption $f'(x)$ and $g'(x)$ exist in some open interval around $a$, so we know that
they exist in some interval $(a, b]$. Then the GMVT (Theorem 3.4.38) tells us that for
$x \in (a, b]$

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)} \]

where $c \in (a, x)$. As we take the limit as $x \to a$, we also have that $c \to a$, and so

\[ \lim_{x\to a^+} \frac{f(x)}{g(x)} = \lim_{x\to a^+} \frac{f'(c)}{g'(c)} = \lim_{c\to a^+} \frac{f'(c)}{g'(c)} \]

as required.


>>> Back to the main text 


3.7.1 Standard examples


Here are some simple examples using L’Hôpital’s rule.


Example 3.7.3


Consider the limit

\[ \lim_{x\to 0} \frac{\sin x}{x} \]

• Notice that

\[ \lim_{x\to 0} \sin x = 0 \qquad \lim_{x\to 0} x = 0 \]

so this is a $0/0$ indeterminate form, and suggests we try l’Hôpital’s rule.
• To apply the rule we must first check the limits of the derivatives.

\[
\begin{array}{lll}
f(x) = \sin x & f'(x) = \cos x & \text{and } f'(0) = 1 \\
g(x) = x & g'(x) = 1 & \text{and } g'(0) = 1
\end{array}
\]

• So by l’Hôpital’s rule

\[ \lim_{x\to 0} \frac{\sin x}{x} = \frac{f'(0)}{g'(0)} = \frac{1}{1} = 1. \]
Example 3.7.4


Consider the limit

\[ \lim_{x\to 0} \frac{\sin(x)}{\sin(2x)} \]

• First check

\[ \lim_{x\to 0} \sin 2x = 0 \qquad \lim_{x\to 0} \sin x = 0 \]

so we again have a $0/0$ indeterminate form.
• Set $f(x) = \sin x$ and $g(x) = \sin 2x$, then

\[
\begin{array}{ll}
f'(x) = \cos x & f'(0) = 1 \\
g'(x) = 2\cos 2x & g'(0) = 2
\end{array}
\]

• And by l’Hôpital’s rule

\[ \lim_{x\to 0} \frac{\sin x}{\sin 2x} = \frac{f'(0)}{g'(0)} = \frac{1}{2}. \]


Example 3.7.5


Let $q > 1$ and compute the limit

\[ \lim_{x\to 0} \frac{q^x - 1}{x} \]

This limit arose in our discussion of exponential functions in Section 2.7.
• First check

\[ \lim_{x\to 0} (q^x - 1) = 1 - 1 = 0 \qquad \lim_{x\to 0} x = 0 \]

so we have a $0/0$ indeterminate form.

• Set $f(x) = q^x - 1$ and $g(x) = x$, then (maybe after a quick review of Section 2.7)

\[
\begin{array}{ll}
f'(x) = \dfrac{d}{dx}(q^x - 1) = q^x \cdot \log q & f'(0) = \log q \\[1ex]
g'(x) = 1 & g'(0) = 1
\end{array}
\]

• And by l’Hôpital’s rule$^{60}$

\[ \lim_{x\to 0} \frac{q^x - 1}{x} = \log q. \]


60 While it might not be immediately obvious, this example relies on circular reasoning. In order to apply
l’Hôpital’s rule, we need to compute the derivative of $q^x$. However in order to compute that derivative (see
Section 2.7) we needed to evaluate this limit.

A more obvious example of this sort of circular reasoning can be seen if we use l’Hôpital’s rule to
compute the derivative of $f(x) = x^n$ at $x = a$ using the limit

\[ f'(a) = \lim_{x\to a} \frac{x^n - a^n}{x - a} = \lim_{x\to a} \frac{n x^{n-1} - 0}{1 - 0} = n a^{n-1}. \]

We have used the result $\frac{d}{dx} x^n = n x^{n-1}$ to prove itself!


In this example, we shall apply L’Hôpital’s rule twice before getting the answer.


Example 3.7.6


Compute the limit

\[ \lim_{x\to 0} \frac{\sin(x^2)}{1 - \cos x} \]

• Again we should check

\[ \lim_{x\to 0} \sin(x^2) = \sin 0 = 0 \qquad \lim_{x\to 0} (1 - \cos x) = 1 - \cos 0 = 0 \]

and we have a $0/0$ indeterminate form.

• Let $f(x) = \sin(x^2)$ and $g(x) = 1 - \cos x$ then

\[
\begin{array}{ll}
f'(x) = 2x\cos(x^2) & f'(0) = 0 \\
g'(x) = \sin x & g'(0) = 0
\end{array}
\]

So if we try to apply l’Hôpital’s rule naively we will get

\[ \lim_{x\to 0} \frac{\sin(x^2)}{1 - \cos x} = \frac{f'(0)}{g'(0)} = \frac{0}{0} \]

which is another $0/0$ indeterminate form.

• It appears that we are stuck until we remember that l’Hôpital’s rule (as stated in
Theorem 3.7.2) has a part (b); now is a good time to reread it.

• It says that

\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{f'(x)}{g'(x)} \]

provided this second limit exists. In our case this requires us to compute

\[ \lim_{x\to 0} \frac{2x\cos(x^2)}{\sin(x)} \]

which we can do using l’Hôpital’s rule again. Now

\[
\begin{array}{lll}
h(x) = 2x\cos(x^2) & h'(x) = 2\cos(x^2) - 4x^2\sin(x^2) & h'(0) = 2 \\
\ell(x) = \sin(x) & \ell'(x) = \cos(x) & \ell'(0) = 1
\end{array}
\]

By l’Hôpital’s rule

\[ \lim_{x\to 0} \frac{2x\cos(x^2)}{\sin(x)} = \frac{h'(0)}{\ell'(0)} = 2 \]

• Thus our original limit is

\[ \lim_{x\to 0} \frac{\sin(x^2)}{1 - \cos x} = \lim_{x\to 0} \frac{2x\cos(x^2)}{\sin(x)} = 2. \]

• We can succinctly summarise the two applications of L’Hôpital’s rule in this example
by

\[
\lim_{x\to 0} \underbrace{\frac{\sin(x^2)}{1 - \cos x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to 0} \underbrace{\frac{2x\cos(x^2)}{\sin x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to 0} \underbrace{\frac{2\cos(x^2) - 4x^2\sin(x^2)}{\cos x}}_{\substack{\text{num}\to 2 \\ \text{den}\to 1}}
= 2
\]

Here “num” and “den” are used as abbreviations of “numerator” and “denominator”
respectively.
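As a sanity check on Example 3.7.6, the three ratios in the summary chain can be evaluated at points approaching $0$. This is only an illustrative sketch (the function names are hypothetical, and floating-point evaluation near a $0/0$ form is never a proof), but all three columns should drift towards $2$.

```python
import math

def original_ratio(x):
    """sin(x^2) / (1 - cos x): the original 0/0 form."""
    return math.sin(x * x) / (1 - math.cos(x))

def first_derivative_ratio(x):
    """f'(x)/g'(x) = 2x cos(x^2) / sin x: still a 0/0 form at x = 0."""
    return 2 * x * math.cos(x * x) / math.sin(x)

def second_derivative_ratio(x):
    """h'(x)/l'(x) = (2 cos(x^2) - 4x^2 sin(x^2)) / cos x: defined at x = 0."""
    return (2 * math.cos(x * x) - 4 * x * x * math.sin(x * x)) / math.cos(x)

for x in (0.1, 0.01, 0.001):
    print(x, original_ratio(x), first_derivative_ratio(x), second_derivative_ratio(x))

# Only the last ratio can be evaluated at x = 0 itself.
limit_value = second_derivative_ratio(0.0)
```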


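One must be careful to ensure that the hypotheses of l’Hôpital’s rule are satisfied before
applying it. The following “warnings” show the sorts of things that can go wrong.


Warning 3.7.7 (Denominator limit nonzero).


If

\[ \lim_{x\to a} g(x) \neq 0 \]

then

\[ \lim_{x\to a} \frac{f(x)}{g(x)} \quad\text{need not be the same as}\quad \frac{f'(a)}{g'(a)} \quad\text{or}\quad \lim_{x\to a} \frac{f'(x)}{g'(x)}. \]

Here is an example. Take

\[ a = 0 \qquad f(x) = 3x \qquad g(x) = 4 + 5x \]

Then

\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{3x}{4 + 5x} = \frac{3 \times 0}{4 + 5 \times 0} = 0
\qquad\text{but}\qquad \lim_{x\to 0} \frac{f'(x)}{g'(x)} = \frac{f'(0)}{g'(0)} = \frac{3}{5} \]

Warning 3.7.8 (Numerator limit nonzero).


If

\[ \lim_{x\to a} f(x) \neq 0 \]

then

\[ \lim_{x\to a} \frac{f(x)}{g(x)} \quad\text{need not be the same as}\quad \frac{f'(a)}{g'(a)} \quad\text{or}\quad \lim_{x\to a} \frac{f'(x)}{g'(x)}. \]

Here is an example. Take

\[ a = 0 \qquad f(x) = 4 + 5x \qquad g(x) = 3x \]

Then

\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{4 + 5x}{3x} = \text{DNE}
\qquad\text{but}\qquad \lim_{x\to 0} \frac{f'(x)}{g'(x)} = \frac{f'(0)}{g'(0)} = \frac{5}{3} \]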


This next one is more subtle; the limits of the original numerator and denominator
functions both go to zero, but the limit of the ratio of their derivatives does not exist.

Warning 3.7.9 (Limit of ratio of derivatives DNE).


If

\[ \lim_{x\to a} f(x) = 0 \qquad\text{and}\qquad \lim_{x\to a} g(x) = 0 \]

but

\[ \lim_{x\to a} \frac{f'(x)}{g'(x)} \quad\text{does not exist} \]

then it is still possible that

\[ \lim_{x\to a} \frac{f(x)}{g(x)} \quad\text{exists.} \]

Here is an example. Take

\[ a = 0 \qquad f(x) = x^2 \sin\frac{1}{x} \qquad g(x) = x \]

Then (with an application of the squeeze theorem)

\[ \lim_{x\to 0} f(x) = 0 \qquad\text{and}\qquad \lim_{x\to 0} g(x) = 0. \]

If we attempt to apply l’Hôpital’s rule then we have $g'(x) = 1$ and

\[ f'(x) = 2x \sin\frac{1}{x} - \cos\frac{1}{x} \]

and we then try to compute the limit

\[ \lim_{x\to 0} \frac{f'(x)}{g'(x)} = \lim_{x\to 0} \left( 2x \sin\frac{1}{x} - \cos\frac{1}{x} \right) \]

However, this limit does not exist. The first term converges to 0 (by the squeeze
theorem), but the second term $\cos(1/x)$ just oscillates wildly between $\pm 1$. All we
can conclude from this is

Since the limit of the ratio of derivatives does not exist, we cannot
apply l’Hôpital’s rule.

Instead we should go back to the original limit and apply the squeeze theorem:

\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{x^2 \sin\frac{1}{x}}{x} = \lim_{x\to 0} x \sin\frac{1}{x} = 0, \]

since $|x \sin(1/x)| \le |x|$ and $|x| \to 0$ as $x \to 0$.

It is also easy to construct an example in which the limits of numerator and
denominator are both zero, but the limit of the ratio and the limit of the ratio of the derivatives do
not exist. A slight change of the previous example shows that it is possible that

\[ \lim_{x\to a} f(x) = 0 \qquad\text{and}\qquad \lim_{x\to a} g(x) = 0 \]

but neither of the limits

\[ \lim_{x\to a} \frac{f(x)}{g(x)} \qquad\text{or}\qquad \lim_{x\to a} \frac{f'(x)}{g'(x)} \]

exist. Take

\[ a = 0 \qquad f(x) = x \sin\frac{1}{x} \qquad g(x) = x \]

Then (with a quick application of the squeeze theorem)

\[ \lim_{x\to 0} f(x) = 0 \qquad\text{and}\qquad \lim_{x\to 0} g(x) = 0. \]

However,

\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{x \sin\frac{1}{x}}{x} = \lim_{x\to 0} \sin\frac{1}{x} \]

does not exist. And similarly

\[ \lim_{x\to 0} \frac{f'(x)}{g'(x)} = \lim_{x\to 0} \left( \sin\frac{1}{x} - \frac{1}{x}\cos\frac{1}{x} \right) \]

does not exist.
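The behaviour described in Warning 3.7.9 can be seen numerically. The sketch below uses the first example there, $f(x) = x^2 \sin(1/x)$ and $g(x) = x$; the sample points $x_k = 1/(k\pi)$ and the range of $k$ are choices made here for illustration, picked so that $\cos(1/x_k)$ alternates between $+1$ and $-1$.

```python
import math

def ratio(x):
    """f(x)/g(x) with f(x) = x^2 sin(1/x) and g(x) = x."""
    return x * math.sin(1 / x)

def derivative_ratio(x):
    """f'(x)/g'(x) = 2x sin(1/x) - cos(1/x)."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Points x_k = 1/(k*pi) shrink towards 0 while cos(1/x_k) keeps flipping sign.
points = [1 / (k * math.pi) for k in range(1000, 1006)]
ratios = [ratio(x) for x in points]
derivative_ratios = [derivative_ratio(x) for x in points]

for x, r, d in zip(points, ratios, derivative_ratios):
    print(f"x={x:.6e}  f/g={r:+.3e}  f'/g'={d:+.3f}")
```

The original ratio stays squeezed inside $\pm|x|$, while the ratio of derivatives keeps jumping between values near $-1$ and $+1$.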


3.7.2 Variations


Theorem 3.7.2 is the basic form of L’Hôpital’s rule, but there are also many variations.
Here are a bunch of them.


(a) L’Hôpital’s rule also applies when the limit of $x \to a$ is replaced by $\lim_{x\to a^+}$ or by $\lim_{x\to a^-}$ or
by $\lim_{x\to +\infty}$ or by $\lim_{x\to -\infty}$.

We can justify adapting the rule to the limits to $\pm\infty$ via the following reasoning

\begin{align*}
\lim_{x\to\infty} \frac{f(x)}{g(x)}
 &= \lim_{y\to 0^+} \frac{f(1/y)}{g(1/y)} && \text{substitute } x = 1/y \\
 &= \lim_{y\to 0^+} \frac{-\frac{1}{y^2} f'(1/y)}{-\frac{1}{y^2} g'(1/y)}
\end{align*}

where we have used l’Hôpital’s rule (assuming this limit exists) and the fact that
$\frac{d}{dy} f(1/y) = -\frac{1}{y^2} f'(1/y)$ (and similarly for $g$). Cleaning this up and substituting
$y = 1/x$ gives the required result:

\[ \lim_{x\to\infty} \frac{f(x)}{g(x)} = \lim_{y\to 0^+} \frac{f'(1/y)}{g'(1/y)} = \lim_{x\to\infty} \frac{f'(x)}{g'(x)} \]
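Example 3.7.10

Consider the limit

\[ \lim_{x\to\infty} \frac{\arctan x - \frac{\pi}{2}}{1/x} \]

Both numerator and denominator go to 0 as $x \to \infty$, so this is a $0/0$ indeterminate
form. We find

\[
\lim_{x\to+\infty} \underbrace{\frac{\arctan x - \frac{\pi}{2}}{\frac{1}{x}}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to+\infty} \frac{\frac{1}{1+x^2}}{-\frac{1}{x^2}}
= -\lim_{x\to+\infty} \underbrace{\frac{1}{1 + \frac{1}{x^2}}}_{\substack{\text{num}\to 1 \\ \text{den}\to 1}}
= -1
\]

We have applied L’Hôpital’s rule with

\[
\begin{array}{ll}
f(x) = \arctan x - \dfrac{\pi}{2} & g(x) = \dfrac{1}{x} \\[2ex]
f'(x) = \dfrac{1}{1 + x^2} & g'(x) = -\dfrac{1}{x^2}
\end{array}
\]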
(b) $\infty/\infty$ indeterminate form: L’Hôpital’s rule also applies when $\lim_{x\to a} f(x) = 0$, $\lim_{x\to a} g(x) = 0$ is
replaced by $\lim_{x\to a} f(x) = \pm\infty$, $\lim_{x\to a} g(x) = \pm\infty$.

Example 3.7.11

Consider the limit

\[ \lim_{x\to\infty} \frac{\log x}{x} \]

The numerator and denominator both blow up towards infinity so this is an $\infty/\infty$
indeterminate form. An application of l’Hôpital’s rule gives

\[
\lim_{x\to\infty} \underbrace{\frac{\log x}{x}}_{\substack{\text{num}\to\infty \\ \text{den}\to\infty}}
= \lim_{x\to\infty} \frac{1/x}{1}
= \lim_{x\to\infty} \frac{1}{x} = 0
\]

Example 3.7.12


Consider the limit

\[ \lim_{x\to\infty} \frac{5x^2 + 3x - 3}{x^2 + 1} \]

Then by two applications of l’Hôpital’s rule we get

\[
\lim_{x\to\infty} \underbrace{\frac{5x^2 + 3x - 3}{x^2 + 1}}_{\substack{\text{num}\to\infty \\ \text{den}\to\infty}}
= \lim_{x\to\infty} \underbrace{\frac{10x + 3}{2x}}_{\substack{\text{num}\to\infty \\ \text{den}\to\infty}}
= \lim_{x\to\infty} \frac{10}{2} = 5.
\]
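Example 3.7.13


Compute the limit

\[ \lim_{x\to 0^+} \frac{\log x}{\tan\left(\frac{\pi}{2} - x\right)} \]

We can compute this using l’Hôpital’s rule twice:

\begin{align*}
\lim_{x\to 0^+} \underbrace{\frac{\log x}{\tan\left(\frac{\pi}{2} - x\right)}}_{\substack{\text{num}\to -\infty \\ \text{den}\to +\infty}}
 &= \lim_{x\to 0^+} \frac{\frac{1}{x}}{-\sec^2\left(\frac{\pi}{2} - x\right)}
  = -\lim_{x\to 0^+} \underbrace{\frac{\cos^2\left(\frac{\pi}{2} - x\right)}{x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}} \\
 &= -\lim_{x\to 0^+} \underbrace{\frac{2\cos\left(\frac{\pi}{2} - x\right)\sin\left(\frac{\pi}{2} - x\right)}{1}}_{\substack{\text{num}\to 0 \\ \text{den}\to 1}}
  = 0
\end{align*}

The first application of L’Hôpital’s rule was with

\[
\begin{array}{ll}
f(x) = \log x & g(x) = \tan\left(\dfrac{\pi}{2} - x\right) \\[2ex]
f'(x) = \dfrac{1}{x} & g'(x) = -\sec^2\left(\dfrac{\pi}{2} - x\right)
\end{array}
\]

and the second time with

\[
\begin{array}{ll}
f(x) = \cos^2\left(\dfrac{\pi}{2} - x\right) & g(x) = x \\[2ex]
f'(x) = 2\cos\left(\dfrac{\pi}{2} - x\right)\left[-\sin\left(\dfrac{\pi}{2} - x\right)\right](-1) & g'(x) = 1
\end{array}
\]


Sometimes things don’t quite work out as we would like and l’Hôpital’s rule can get
stuck in a loop. Remember to think about the problem before you apply any rule.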


Example 3.7.14


Consider the limit

\[ \lim_{x\to\infty} \frac{e^x + e^{-x}}{e^x - e^{-x}} \]

Clearly both numerator and denominator go to $\infty$, so we have an $\infty/\infty$ indeterminate
form. Naively applying l’Hôpital’s rule gives

\[ \lim_{x\to\infty} \frac{e^x + e^{-x}}{e^x - e^{-x}} = \lim_{x\to\infty} \frac{e^x - e^{-x}}{e^x + e^{-x}} \]

which is again an $\infty/\infty$ indeterminate form. So apply l’Hôpital’s rule again:

\[ \lim_{x\to\infty} \frac{e^x - e^{-x}}{e^x + e^{-x}} = \lim_{x\to\infty} \frac{e^x + e^{-x}}{e^x - e^{-x}} \]

which is right back where we started!
The correct approach to such a limit is to apply the methods we learned in Chapter 1
and rewrite

\[ \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{e^x(1 + e^{-2x})}{e^x(1 - e^{-2x})} = \frac{1 + e^{-2x}}{1 - e^{-2x}} \]

and then take the limit, which is 1 because $e^{-2x} \to 0$ as $x \to \infty$.


A similar sort of l’Hôpital-rule-loop will occur if you naively apply l’Hôpital’s rule to
the limit


which appeared in Example 1.5.6.
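The rewriting used in Example 3.7.14 also matters when the expression is evaluated on a computer. The sketch below (function names are hypothetical) compares the original quotient with the rewritten one: for moderate $x$ they agree, while for large $x$ the naive form overflows in double precision and the rewritten form simply returns a value indistinguishable from the limit $1$.

```python
import math

def naive(x):
    """(e^x + e^-x) / (e^x - e^-x), evaluated exactly as written."""
    return (math.exp(x) + math.exp(-x)) / (math.exp(x) - math.exp(-x))

def rewritten(x):
    """(1 + e^-2x) / (1 - e^-2x), after cancelling the common factor e^x."""
    t = math.exp(-2 * x)
    return (1 + t) / (1 - t)

print(naive(5.0), rewritten(5.0))   # the two forms agree for moderate x

try:
    naive(800.0)                     # math.exp(800) exceeds the double range
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

print(naive_overflowed, rewritten(800.0))
```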


>>> Optional: proof of l’Hôpital’s rule for $\infty/\infty$ forms


We can justify this generalisation of l’Hôpital’s rule with some careful manipulations.
Since the derivatives $f'$, $g'$ exist in some interval around $a$, we know that $f$, $g$ are
continuous in some interval around $a$; let $x$, $t$ be points inside that interval. Now rewrite$^{61}$

\begin{align*}
\frac{f(x)}{g(x)}
 &= \frac{f(x)}{g(x)}
   + \underbrace{\left( \frac{f(t)}{g(x)} - \frac{f(t)}{g(x)} \right)}_{=0}
   + \underbrace{\left( \frac{f(x) - f(t)}{g(x) - g(t)} - \frac{f(x) - f(t)}{g(x) - g(t)} \right)}_{=0} \\
 &= \underbrace{\frac{f(x) - f(t)}{g(x) - g(t)}}_{\text{ready for GMVT}}
   + \frac{f(t)}{g(x)}
   + \underbrace{\left( \frac{f(x)}{g(x)} - \frac{f(t)}{g(x)} - \frac{f(x) - f(t)}{g(x) - g(t)} \right)}_{\text{we can clean it up}} \\
 &= \frac{f(x) - f(t)}{g(x) - g(t)} + \frac{f(t)}{g(x)}
   + \left( \frac{1}{g(x)} - \frac{1}{g(x) - g(t)} \right) \cdot \big( f(x) - f(t) \big) \\
 &= \frac{f(x) - f(t)}{g(x) - g(t)} + \frac{f(t)}{g(x)}
   + \left( \frac{g(x) - g(t) - g(x)}{g(x)\,\big(g(x) - g(t)\big)} \right) \cdot \big( f(x) - f(t) \big) \\
 &= \underbrace{\frac{f(x) - f(t)}{g(x) - g(t)}}_{\text{ready for GMVT}}
   + \frac{1}{g(x)} \Bigg( f(t) - g(t) \cdot \underbrace{\frac{f(x) - f(t)}{g(x) - g(t)}}_{\text{ready for GMVT}} \Bigg)
\end{align*}

Oof! Now the generalised mean-value theorem (Theorem 3.4.38) tells us there is a $c$
between $x$ and $t$ so that

\[ \frac{f(x) - f(t)}{g(x) - g(t)} = \frac{f'(c)}{g'(c)} \]

Now substitute this into the large expression we derived above:

\[ \frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)} + \frac{1}{g(x)} \left( f(t) - \frac{f'(c)}{g'(c)} \cdot g(t) \right) \]

At first glance this does not appear so useful, however if we fix $t$ and take the limit as
$x \to a$, then it becomes

\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(c)}{g'(c)} + \lim_{x\to a} \frac{1}{g(x)} \left( f(t) - \frac{f'(c)}{g'(c)} \cdot g(t) \right) \]

Since $g(x) \to \infty$ as $x \to a$, this last term goes to zero, so

\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(c)}{g'(c)} + 0 \]

Now take the limit as $t \to a$. The left-hand side is unchanged since it is independent
of $t$. The right-hand side, however, does change; the number $c$ is trapped between $x$
and $t$. Since we have already taken the limit $x \to a$, when we take the limit $t \to a$
we are effectively taking the limit $c \to a$. Hence

\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{c\to a} \frac{f'(c)}{g'(c)} \]

which is the desired result.


61 This is quite a clever argument, but it is not immediately obvious why one rewrites things this way.
After the fact it becomes clear that it is done to massage the expression into the form where we can
apply the generalised mean-value theorem (Theorem 3.4.38).

>>> Back to the main text


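(c) $0 \cdot \infty$ indeterminate form: When $\lim_{x\to a} f(x) = 0$ and $\lim_{x\to a} g(x) = \infty$. We can use a little
algebra to manipulate this into either a $0/0$ or $\infty/\infty$ form:

\[ \lim_{x\to a} \frac{f(x)}{1/g(x)} \qquad\qquad \lim_{x\to a} \frac{g(x)}{1/f(x)} \]


Example 3.7.15


Consider the limit

\[ \lim_{x\to 0^+} x \cdot \log x \]

Here the function $f(x) = x$ goes to zero, while $g(x) = \log x$ goes to $-\infty$. If we rewrite
this as the fraction

\[ x \cdot \log x = \frac{\log x}{1/x} \]

then the $0 \cdot \infty$ form has become an $\infty/\infty$ form.
The result is then

\[
\lim_{x\to 0^+} \underbrace{x}_{\to 0} \cdot \underbrace{\log x}_{\to -\infty}
= \lim_{x\to 0^+} \underbrace{\frac{\log x}{\frac{1}{x}}}_{\substack{\text{num}\to -\infty \\ \text{den}\to \infty}}
= \lim_{x\to 0^+} \frac{\frac{1}{x}}{-\frac{1}{x^2}}
= -\lim_{x\to 0^+} x = 0
\]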
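Example 3.7.16

In this example we’ll evaluate $\lim_{x\to+\infty} x^n e^{-x}$, for all natural numbers $n$. We’ll start with
$n = 1$ and $n = 2$ and then, using what we have learned from those cases, move on to
general $n$.

\[
\lim_{x\to+\infty} \underbrace{x}_{\to\infty} \underbrace{e^{-x}}_{\to 0}
= \lim_{x\to+\infty} \underbrace{\frac{x}{e^x}}_{\substack{\text{num}\to+\infty \\ \text{den}\to+\infty}}
= \lim_{x\to+\infty} \underbrace{\frac{1}{e^x}}_{\substack{\text{num}\to 1 \\ \text{den}\to+\infty}}
= \lim_{x\to+\infty} e^{-x} = 0
\]

Applying l’Hôpital twice,

\[
\lim_{x\to+\infty} \underbrace{x^2}_{\to\infty} \underbrace{e^{-x}}_{\to 0}
= \lim_{x\to+\infty} \underbrace{\frac{x^2}{e^x}}_{\substack{\text{num}\to+\infty \\ \text{den}\to+\infty}}
= \lim_{x\to+\infty} \underbrace{\frac{2x}{e^x}}_{\substack{\text{num}\to\infty \\ \text{den}\to+\infty}}
= \lim_{x\to+\infty} \underbrace{\frac{2}{e^x}}_{\substack{\text{num}\to 2 \\ \text{den}\to+\infty}}
= \lim_{x\to+\infty} 2e^{-x} = 0
\]

Indeed, for any natural number $n$, applying l’Hôpital $n$ times gives

\begin{align*}
\lim_{x\to+\infty} \underbrace{x^n}_{\to\infty} \underbrace{e^{-x}}_{\to 0}
 &= \lim_{x\to+\infty} \underbrace{\frac{x^n}{e^x}}_{\substack{\text{num}\to+\infty \\ \text{den}\to+\infty}} \\
 &= \lim_{x\to+\infty} \underbrace{\frac{n x^{n-1}}{e^x}}_{\substack{\text{num}\to\infty \\ \text{den}\to+\infty}} \\
 &= \lim_{x\to+\infty} \underbrace{\frac{n(n-1) x^{n-2}}{e^x}}_{\substack{\text{num}\to\infty \\ \text{den}\to+\infty}} \\
 &= \cdots = \lim_{x\to+\infty} \underbrace{\frac{n!}{e^x}}_{\substack{\text{num}\to n! \\ \text{den}\to+\infty}} = 0
\end{align*}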
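The conclusion of Example 3.7.16, that $e^{-x}$ beats every power of $x$, can be observed numerically. The following sketch evaluates $x^n e^{-x}$ as $\exp(n \log x - x)$, a rewriting chosen here only so that large $x$ does not overflow; the function name and the sample values of $n$ and $x$ are illustrative assumptions.

```python
import math

def x_power_times_decay(n, x):
    """Evaluate x^n * e^(-x) for x > 0 as exp(n*log(x) - x)."""
    return math.exp(n * math.log(x) - x)

sample_points = (10.0, 100.0, 1000.0)
values = {
    n: [x_power_times_decay(n, x) for x in sample_points]
    for n in (1, 2, 5)
}

for n, row in values.items():
    print(n, row)   # each row shrinks rapidly towards 0
```

For $x = 1000$ the results underflow to exactly `0.0` in double precision, which is consistent with (though of course not a proof of) the limit being $0$.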


(d) $\infty - \infty$ indeterminate form: When $\lim_{x\to a} f(x) = \infty$ and $\lim_{x\to a} g(x) = \infty$. We rewrite the
difference as a fraction using a common denominator

\[ f(x) - g(x) = \frac{h(x)}{\ell(x)} \]

which is then a $0/0$ or $\infty/\infty$ form.


Example 3.7.17


Consider the limit

\[ \lim_{x\to\frac{\pi}{2}^-} (\sec x - \tan x) \]

Since the limit of both $\sec x$ and $\tan x$ is $+\infty$ as $x \to \frac{\pi}{2}^-$, this is an $\infty - \infty$ indeterminate
form. However we can rewrite this as

\[ \sec x - \tan x = \frac{1}{\cos x} - \frac{\sin x}{\cos x} = \frac{1 - \sin x}{\cos x} \]

which is then a $0/0$ indeterminate form. This then gives

\[
\lim_{x\to\frac{\pi}{2}^-} (\sec x - \tan x)
= \lim_{x\to\frac{\pi}{2}^-} \underbrace{\frac{1 - \sin x}{\cos x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to\frac{\pi}{2}^-} \underbrace{\frac{-\cos x}{-\sin x}}_{\substack{\text{num}\to 0 \\ \text{den}\to -1}}
= 0
\]

In the last example, Example 3.7.17, we converted an $\infty - \infty$ indeterminate form into
a $0/0$ indeterminate form by exploiting the fact that the two terms, $\sec x$ and $\tan x$, in
the $\infty - \infty$ indeterminate form shared a common denominator, namely $\cos x$. In the
“real world” that will, of course, almost never happen. However as the next couple of
examples show, you can often massage these expressions into suitable forms.


Here is another, much more complicated, example, where it doesn’t happen.


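Example 3.7.18


In this example, we evaluate the $\infty - \infty$ indeterminate form

\[ \lim_{x\to 0} \left( \frac{1}{x} - \frac{1}{\log(1+x)} \right) \]

We convert it into a $0/0$ indeterminate form simply by putting the two fractions, $\frac{1}{x}$ and
$\frac{1}{\log(1+x)}$, over a common denominator.

\[
\lim_{x\to 0} \Bigg( \underbrace{\frac{1}{x}}_{\to\pm\infty} - \underbrace{\frac{1}{\log(1+x)}}_{\to\pm\infty} \Bigg)
= \lim_{x\to 0} \underbrace{\frac{\log(1+x) - x}{x \log(1+x)}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
\tag{3.7.1}
\]

Now we apply L’Hôpital’s rule, and simplify

\begin{align*}
\lim_{x\to 0} \underbrace{\frac{\log(1+x) - x}{x \log(1+x)}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
 &= \lim_{x\to 0} \frac{\frac{1}{1+x} - 1}{\log(1+x) + \frac{x}{1+x}}
  = \lim_{x\to 0} \frac{1 - (1+x)}{(1+x)\log(1+x) + x} \\
 &= -\lim_{x\to 0} \underbrace{\frac{x}{(1+x)\log(1+x) + x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 1\times 0 + 0 = 0}}
 \tag{3.7.2}
\end{align*}

Then we apply L’Hôpital’s rule a second time

\[
\lim_{x\to 0} \underbrace{\frac{x}{(1+x)\log(1+x) + x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 1\times 0 + 0 = 0}}
= \lim_{x\to 0} \underbrace{\frac{1}{\log(1+x) + \frac{1+x}{1+x} + 1}}_{\substack{\text{num}\to 1 \\ \text{den}\to 0 + 1 + 1 = 2}}
= \frac{1}{2}
\tag{3.7.3}
\]

Combining (3.7.1), (3.7.2) and (3.7.3) gives our final answer

\[ \lim_{x\to 0} \left( \frac{1}{x} - \frac{1}{\log(1+x)} \right) = -\frac{1}{2} \]


The following example can be done by l’Hôpital’s rule, but it is actually far simpler to
multiply by the conjugate and take the limit using the tools of Chapter 1.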


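Example 3.7.19


Consider the limit

\[ \lim_{x\to\infty} \sqrt{x^2 + 4x} - \sqrt{x^2 - 3x} \]

Neither term is a fraction, but we can write

\begin{align*}
\sqrt{x^2 + 4x} - \sqrt{x^2 - 3x}
 &= x\sqrt{1 + 4/x} - x\sqrt{1 - 3/x} && \text{assuming } x > 0 \\
 &= x\left( \sqrt{1 + 4/x} - \sqrt{1 - 3/x} \right) \\
 &= \frac{\sqrt{1 + 4/x} - \sqrt{1 - 3/x}}{1/x}
\end{align*}

which is now a $0/0$ form with $f(x) = \sqrt{1 + 4/x} - \sqrt{1 - 3/x}$ and $g(x) = 1/x$. Then

\[ f'(x) = \frac{-4/x^2}{2\sqrt{1 + 4/x}} - \frac{3/x^2}{2\sqrt{1 - 3/x}} \qquad g'(x) = -\frac{1}{x^2} \]

Hence

\[ \frac{f'(x)}{g'(x)} = \frac{4}{2\sqrt{1 + 4/x}} + \frac{3}{2\sqrt{1 - 3/x}} \]

And so in the limit as $x \to \infty$

\[ \lim_{x\to\infty} \frac{f'(x)}{g'(x)} = \frac{4}{2} + \frac{3}{2} = \frac{7}{2} \]

and so our original limit is also $7/2$.


By comparison, if we multiply by the conjugate we have

\begin{align*}
\sqrt{x^2 + 4x} - \sqrt{x^2 - 3x}
 &= \left( \sqrt{x^2 + 4x} - \sqrt{x^2 - 3x} \right) \cdot \frac{\sqrt{x^2 + 4x} + \sqrt{x^2 - 3x}}{\sqrt{x^2 + 4x} + \sqrt{x^2 - 3x}} \\
 &= \frac{x^2 + 4x - (x^2 - 3x)}{\sqrt{x^2 + 4x} + \sqrt{x^2 - 3x}} \\
 &= \frac{7x}{\sqrt{x^2 + 4x} + \sqrt{x^2 - 3x}} \\
 &= \frac{7}{\sqrt{1 + 4/x} + \sqrt{1 - 3/x}} && \text{assuming } x > 0
\end{align*}

Now taking the limit as $x \to \infty$ gives $7/2$ as required. Just because we know l’Hôpital’s
rule, it does not mean we should use it everywhere it might be applied.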
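The conjugate form at the end of Example 3.7.19 is also the better one to evaluate numerically. In the sketch below (function names are hypothetical), both forms agree for moderately large $x$, but only the conjugate form remains trustworthy for very large $x$, where the direct difference of two nearly equal square roots loses its significant digits.

```python
import math

def direct_difference(x):
    """sqrt(x^2 + 4x) - sqrt(x^2 - 3x), evaluated exactly as written."""
    return math.sqrt(x * x + 4 * x) - math.sqrt(x * x - 3 * x)

def conjugate_form(x):
    """7 / (sqrt(1 + 4/x) + sqrt(1 - 3/x)), valid for x > 3."""
    return 7 / (math.sqrt(1 + 4 / x) + math.sqrt(1 - 3 / x))

for x in (1e3, 1e6, 1e17):
    print(x, direct_difference(x), conjugate_form(x))
```

The conjugate column settles on $7/2$; the direct column should not be relied on once $4x$ is too small to register next to $x^2$ in floating point.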




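(e) $1^\infty$ indeterminate form: We can use l’Hôpital’s rule on limits of the form

\[ \lim_{x\to a} f(x)^{g(x)} \qquad\text{with}\qquad \lim_{x\to a} f(x) = 1 \quad\text{and}\quad \lim_{x\to a} g(x) = \infty \]

by considering the logarithm of the limit$^{62}$:

\[ \log\left( \lim_{x\to a} f(x)^{g(x)} \right) = \lim_{x\to a} \log\left( f(x)^{g(x)} \right) = \lim_{x\to a} \log\big( f(x) \big) \cdot g(x) \]

which is now a $0 \cdot \infty$ form. This can be further transformed into a $0/0$ or $\infty/\infty$ form:

\[ \log\left( \lim_{x\to a} f(x)^{g(x)} \right) = \lim_{x\to a} \log\big( f(x) \big) \cdot g(x) = \lim_{x\to a} \frac{\log\big( f(x) \big)}{1/g(x)} \]

62 We are using the fact that the logarithm is a continuous function and Theorem 1.6.10.


Example 3.7.20


The following limit appears quite naturally when considering systems which display
exponential growth or decay.

\[ \lim_{x\to 0} (1 + x)^{a/x} \qquad\text{with the constant } a \neq 0 \]

Since $(1 + x) \to 1$ and $a/x \to \infty$ this is a $1^\infty$ indeterminate form.

By considering its logarithm we have

\begin{align*}
\log\left( \lim_{x\to 0} (1 + x)^{a/x} \right)
 &= \lim_{x\to 0} \log\left( (1 + x)^{a/x} \right) \\
 &= \lim_{x\to 0} \frac{a}{x} \log(1 + x) \\
 &= \lim_{x\to 0} \frac{a \log(1 + x)}{x}
\end{align*}

which is now a $0/0$ form. Applying l’Hôpital’s rule gives

\[
\lim_{x\to 0} \underbrace{\frac{a \log(1 + x)}{x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to 0} \underbrace{\frac{\frac{a}{1+x}}{1}}_{\substack{\text{num}\to a \\ \text{den}\to 1}}
= a
\]

Since $(1 + x)^{a/x} = \exp\left[ \log\left( (1 + x)^{a/x} \right) \right]$ and the exponential function is continuous,
our original limit is $e^a$.

Here is a more complicated example of a $1^\infty$ indeterminate form.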


Example 3.7.21


In the limit

\[ \lim_{x\to 0} \left( \frac{\sin x}{x} \right)^{\frac{1}{x^2}} \]

the base, $\frac{\sin x}{x}$, converges to 1 (see Example 3.7.3) and the exponent, $\frac{1}{x^2}$, goes to $\infty$. But
if we take logarithms then

\[ \log\left( \frac{\sin x}{x} \right)^{\frac{1}{x^2}} = \frac{\log\frac{\sin x}{x}}{x^2} \]

then, in the limit $x \to 0$, we have a $0/0$ indeterminate form. One application of
l’Hôpital’s rule gives

\[
\lim_{x\to 0} \underbrace{\frac{\log\frac{\sin x}{x}}{x^2}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= \lim_{x\to 0} \frac{\frac{x}{\sin x} \cdot \frac{x\cos x - \sin x}{x^2}}{2x}
= \lim_{x\to 0} \frac{\frac{x\cos x - \sin x}{x \sin x}}{2x}
= \lim_{x\to 0} \frac{x\cos x - \sin x}{2x^2 \sin x}
\]

which is another $0/0$ form. Applying l’Hôpital’s rule again gives:

\begin{align*}
\lim_{x\to 0} \underbrace{\frac{x\cos x - \sin x}{2x^2 \sin x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
 &= \lim_{x\to 0} \frac{\cos x - x\sin x - \cos x}{4x\sin x + 2x^2\cos x} \\
 &= -\lim_{x\to 0} \frac{x\sin x}{4x\sin x + 2x^2\cos x}
  = -\lim_{x\to 0} \frac{\sin x}{4\sin x + 2x\cos x}
\end{align*}

which is yet another $0/0$ form. Once more with l’Hôpital’s rule:

\[
-\lim_{x\to 0} \underbrace{\frac{\sin x}{4\sin x + 2x\cos x}}_{\substack{\text{num}\to 0 \\ \text{den}\to 0}}
= -\lim_{x\to 0} \underbrace{\frac{\cos x}{4\cos x + 2\cos x - 2x\sin x}}_{\substack{\text{num}\to 1 \\ \text{den}\to 6}}
= -\frac{1}{6}
\]

Oof! We have just shown that the logarithm of our original limit is $-1/6$. Hence the
original limit itself is $e^{-1/6}$.


This was quite a complicated example. However it does illustrate the importance of
cleaning up your algebraic expressions. This will both reduce the amount of work you
have to do and will also reduce the number of errors you make.


(f) $0^0$ indeterminate form: Like the $1^\infty$ form, this can be treated by considering its
logarithm.


Example 3.7.22


For example, in the limit

\[ \lim_{x\to 0^+} x^x \]

both the base, $x$, and the exponent, also $x$, go to zero. But if we consider the logarithm
then we have

\[ \log x^x = x \log x \]

which is a $0 \cdot \infty$ indeterminate form, which we already know how to treat. In fact, we
already found, in Example 3.7.15, that

\[ \lim_{x\to 0^+} x \log x = 0 \]

Since the exponential is a continuous function

\[ \lim_{x\to 0^+} x^x = \lim_{x\to 0^+} \exp\big( x \log x \big) = \exp\Big( \lim_{x\to 0^+} x \log x \Big) = e^0 = 1 \]


(g) $\infty^0$ indeterminate form: Again, we can treat this form by considering its logarithm.


Example 3.7.23


For example, in the limit

\[ \lim_{x\to+\infty} x^{\frac{1}{x}} \]

the base, $x$, goes to infinity and the exponent, $\frac{1}{x}$, goes to zero. But if we take logarithms

\[ \log x^{\frac{1}{x}} = \frac{\log x}{x} \]

which is an $\infty/\infty$ form, which we know how to treat.

\[
\lim_{x\to+\infty} \underbrace{\frac{\log x}{x}}_{\substack{\text{num}\to\infty \\ \text{den}\to\infty}}
= \lim_{x\to+\infty} \underbrace{\frac{\frac{1}{x}}{1}}_{\substack{\text{num}\to 0 \\ \text{den}\to 1}}
= 0
\]

Since the exponential is a continuous function

\[ \lim_{x\to+\infty} x^{\frac{1}{x}} = \lim_{x\to+\infty} \exp\left( \frac{\log x}{x} \right) = \exp\left( \lim_{x\to+\infty} \frac{\log x}{x} \right) = e^0 = 1 \]
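The exponent-type forms (e), (f) and (g) above were all handled the same way: take the logarithm, find the limit of the exponent times the logarithm of the base, then exponentiate. A rough numerical sketch of that recipe follows; the helper `power_via_log`, the value $a = 3$ and the sample points are assumptions made purely for illustration.

```python
import math

def power_via_log(base, exponent):
    """Evaluate base**exponent as exp(exponent * log(base)), for base > 0."""
    return math.exp(exponent * math.log(base))

a = 3.0   # illustrative constant for the limit of (1 + x)^(a/x)

# 1^infinity forms (Examples 3.7.20 and 3.7.21)
one_to_infinity = power_via_log(1 + 1e-6, a / 1e-6)                  # near e^a
sinc_power = power_via_log(math.sin(0.01) / 0.01, 1 / 0.01 ** 2)     # near e^(-1/6)

# 0^0 form (Example 3.7.22) and infinity^0 form (Example 3.7.23)
zero_to_zero = power_via_log(1e-9, 1e-9)       # x^x with x close to 0+
infinity_to_zero = power_via_log(1e9, 1e-9)    # x^(1/x) with x large

print(one_to_infinity, math.exp(a))
print(sinc_power, math.exp(-1 / 6))
print(zero_to_zero, infinity_to_zero)
```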


TOWARDS MATHEMATICS 101 


4.1 Introduction to antiderivatives


We have now come to the final topic of the course: antiderivatives. This is only a short
section since it is really just to give a taste of the next calculus subject: integral calculus.

So far in the course we have learned how to determine the rate of change (i.e. the
derivative) of a given function. That is

\[ \text{given a function } f(x) \text{ find } \frac{df}{dx} \text{ or } f'(x). \]

Along the way we developed an understanding of limits, which allowed us to define
instantaneous rates of change, that is, the derivative. We then went on to develop a number of
applications of derivatives to modelling and approximation. In this last section we want
to just introduce the idea of antiderivatives. That is

\[ \text{given a derivative } \frac{df}{dx} \text{ find the original function } f(x). \]

For example, say we know that

\[ \frac{df}{dx} = x^2 \]

and we want to find $f(x)$. From our previous experience differentiating we know that
derivatives of polynomials are again polynomials. So we guess that our unknown function
$f(x)$ is a polynomial. Further we know that when we differentiate $x^n$ we get $n x^{n-1}$:
multiply by the exponent and reduce the exponent by 1. So to end up with a derivative of
$x^2$ we need to have differentiated an $x^3$. But $\frac{d}{dx} x^3 = 3x^2$, so we need to divide both sides
by 3 to get the answer we want. That is

\[ \frac{d}{dx} \left( \frac{1}{3} x^3 \right) = x^2 \]

However we know that the derivative of a constant is zero, so we also have

\[ \frac{d}{dx} \left( \frac{1}{3} x^3 + 1 \right) = x^2 \]

and

\[ \frac{d}{dx} \left( \frac{1}{3} x^3 - \pi \right) = x^2 \]

At this point it will really help the discussion to give a name to what we are doing. 


A function $F(x)$ that satisfies

\[ \frac{d}{dx} F(x) = f(x) \]

is called an antiderivative of $f(x)$.


Notice the use of the indefinite article there — an antiderivative. This is precisely 
because we can always add or subtract a constant to an antiderivative and when we dif- 
ferentiate we'll get the same answer. We can write this as a lemma, but it is actually just 
Corollary 2.13.12 (from back in the section on the mean-value theorem) in disguise. 


Lemma 4.1.2. 


Let $F(x)$ be an antiderivative of $f(x)$, then for any constant $c$, the function $F(x) + c$
is also an antiderivative of $f(x)$.


Because of this lemma we typically write antiderivatives with “$+c$” tacked on the end.
That is, if we know that $F'(x) = f(x)$, then we would state that the antiderivative of $f(x)$
is

\[ F(x) + c \]

where this “$+c$” is there to remind us that we can always add or subtract some constant
and it will still be an antiderivative of $f(x)$. Hence the antiderivative of $x^2$ is

\[ \frac{1}{3} x^3 + c \]

Similarly, the antiderivative of $x^4$ is

\[ \frac{1}{5} x^5 + c \]

and for $\sqrt{x} = x^{1/2}$ it is

\[ \frac{2}{3} x^{3/2} + c \]

This last one is tricky (at first glance), but we can always check our answer by
differentiating.

\[ \frac{d}{dx} \left( \frac{2}{3} x^{3/2} + c \right) = \frac{2}{3} \cdot \frac{3}{2} x^{1/2} + 0 = x^{1/2} \quad\checkmark \]


Now in order to determine the value of $c$ we need more information. For example, we
might be asked

Given that $g'(t) = t^2$ and $g(3) = 7$ find $g(t)$.

We are given the derivative and one piece of additional information and from these two
facts we need to find the original function. From our work above we know that

\[ g(t) = \frac{1}{3} t^3 + c \]

and we can find $c$ from the other piece of information

\[ 7 = g(3) = \frac{1}{3} \cdot 27 + c = 9 + c \]

Hence $c = -2$ and so

\[ g(t) = \frac{1}{3} t^3 - 2 \]

We can then very easily check our answer by recomputing $g(3)$ and $g'(t)$. This is a good
habit to get into.
Finding antiderivatives of polynomials is generally not too hard. We just need to use
the rule

\[ \text{if } f(x) = x^n \text{ then } F(x) = \frac{1}{n+1} x^{n+1} + c. \]

Of course this breaks down when $n = -1$. In order to find an antiderivative for $f(x) = \frac{1}{x}$
we need to remember that $\frac{d}{dx} \log x = \frac{1}{x}$ and more generally that $\frac{d}{dx} \log|x| = \frac{1}{x}$. See
Example 2.10.4. So

\[ \text{if } f(x) = \frac{1}{x} \text{ then } F(x) = \log|x| + c \]


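Example 4.1.3


Let $f(x) = 3x^5 - 7x^2 + 2x + 3 + x^{-1} - x^{-2}$. Then the antiderivative of $f(x)$ is

\begin{align*}
F(x) &= \frac{3}{6} x^6 - \frac{7}{3} x^3 + \frac{2}{2} x^2 + 3x + \log|x| - \frac{1}{-1} x^{-1} + c && \text{clean it up} \\
     &= \frac{1}{2} x^6 - \frac{7}{3} x^3 + x^2 + 3x + \log|x| + x^{-1} + c
\end{align*}

Now to check we should differentiate and hopefully we get back to where we started

\begin{align*}
F'(x) &= \frac{6}{2} x^5 - \frac{7}{3} \cdot 3x^2 + 2x + 3 + \frac{1}{x} - x^{-2} \\
      &= 3x^5 - 7x^2 + 2x + 3 + \frac{1}{x} - \frac{1}{x^2} \quad\checkmark
\end{align*}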
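The “differentiate to check” habit from Example 4.1.3 can also be carried out numerically. In this sketch the derivative of $F$ is approximated by a symmetric difference quotient (the helper name, step size and sample points are assumptions made here) and compared with $f$ at a few points on either side of $x = 0$.

```python
import math

def f(x):
    """The function from Example 4.1.3."""
    return 3 * x**5 - 7 * x**2 + 2 * x + 3 + 1 / x - 1 / x**2

def F(x, c=0.0):
    """The proposed antiderivative, with the arbitrary constant c."""
    return 0.5 * x**6 - (7 / 3) * x**3 + x**2 + 3 * x + math.log(abs(x)) + 1 / x + c

def central_derivative(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

checks = [(x, central_derivative(F, x), f(x)) for x in (-2.0, 0.5, 1.5)]
for x, approx, exact in checks:
    print(x, approx, exact)   # the last two columns should nearly agree
```

Changing the default `c` leaves the middle column unchanged, which is Lemma 4.1.2 in action.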


In your next calculus course you will develop a lot of machinery to help you find 
antiderivatives. At this stage about all that we can do is continue the sort of thing we 
have done. Think about the derivatives we know and work backwards. So, for example, 
we can take a list of derivatives 


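$F(x)$                   | $1$ | $x^n$       | $\sin x$ | $\cos x$  | $\tan x$   | $e^x$ | $\ln|x|$      | $\arcsin x$            | $\arctan x$
$f(x) = \frac{d}{dx}F(x)$ | $0$ | $n x^{n-1}$ | $\cos x$ | $-\sin x$ | $\sec^2 x$ | $e^x$ | $\frac{1}{x}$ | $\frac{1}{\sqrt{1-x^2}}$ | $\frac{1}{1+x^2}$

and flip it upside down to give the tables

$f(x) = \frac{d}{dx}F(x)$ | $0$ | $n x^{n-1}$ | $\cos x$     | $-\sin x$    | $\sec^2 x$   | $e^x$     | $\frac{1}{x}$ | $\frac{1}{\sqrt{1-x^2}}$ | $\frac{1}{1+x^2}$
$F(x)$                   | $c$ | $x^n + c$   | $\sin x + c$ | $\cos x + c$ | $\tan x + c$ | $e^x + c$ | $\ln|x| + c$  | $\arcsin x + c$        | $\arctan x + c$

of antiderivatives. Here $c$ is just a constant; any constant. But we can do a little more;
clean up $x^n$ by dividing by $n$ and then replacing $n$ by $n + 1$. Similarly we can tweak $\sin x$
by multiplying by $-1$:

$f(x) = \frac{d}{dx}F(x)$ | $0$ | $x^n$                      | $\cos x$     | $\sin x$      | $\sec^2 x$   | $e^x$     | $\frac{1}{x}$ | $\frac{1}{\sqrt{1-x^2}}$ | $\frac{1}{1+x^2}$
$F(x)$                   | $c$ | $\frac{1}{n+1}x^{n+1} + c$ | $\sin x + c$ | $-\cos x + c$ | $\tan x + c$ | $e^x + c$ | $\ln|x| + c$  | $\arcsin x + c$        | $\arctan x + c$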


Here are a couple more examples. 


Example 4.1.4


Consider the functions

\[ f(x) = \sin x + \cos 2x \qquad\qquad g(x) = \frac{1}{1 + 4x^2} \]

Find their antiderivatives.

Solution. The first one we can almost just look up in our table. Let $F$ be the antiderivative of
$f$, then

\[ F(x) = -\cos x + \sin 2x + c \qquad\text{is not quite right.} \]

When we differentiate to check things, we get a factor of two coming from the chain rule.
Hence to compensate for that we multiply $\sin 2x$ by $\frac{1}{2}$:

\[ F(x) = -\cos x + \frac{1}{2} \sin 2x + c \]

Differentiating this shows that we have the right answer.
Similarly, if we use $G$ to denote the antiderivative of $g$, then it appears that $G$ is nearly
$\arctan x$. To get this extra factor of 4 we need to substitute $x \mapsto 2x$. So we try

\[ G(x) = \arctan(2x) + c \qquad\text{which is nearly correct.} \]

Differentiating this gives us

\[ G'(x) = \frac{2}{1 + (2x)^2} = 2g(x) \]

Hence we should multiply by $\frac{1}{2}$. This gives us

\[ G(x) = \frac{1}{2} \arctan(2x) + c \]

We can then check that this is, in fact, correct just by differentiating.


Now let’s do a more substantial example. 


Example 4.1.5


Suppose that we are driving to class. We start at $x = 0$ at time $t = 0$. Our velocity is
$v(t) = 50\sin(t)$. The class is at $x = 25$. When do we get there?


Solution. Let’s denote by $x(t)$ our position at time $t$. We are told that
• $x(0) = 0$
• $x'(t) = 50\sin t$


We have to determine $x(t)$ and then find the time $T$ that obeys $x(T) = 25$. Now armed
with our table above we know that the antiderivative of $\sin t$ is just $-\cos t$. We can check
this:

\[ \frac{d}{dt} (-\cos t) = \sin t \]

We can then get the factor of 50 by multiplying both sides of the above equation by 50:

\[ \frac{d}{dt} (-50\cos t) = 50\sin t \]

And of course, this is just an antiderivative of $50\sin t$; to write down the general
antiderivative we just add a constant $c$:

\[ \frac{d}{dt} (-50\cos t + c) = 50\sin t \]

Since $v(t) = \frac{d}{dt} x(t)$, this antiderivative is $x(t)$:

\[ x(t) = -50\cos t + c \]

To determine $c$ we make use of the other piece of information we are given, namely

\[ x(0) = 0. \]

Substituting this in gives us

\[ x(0) = -50\cos 0 + c = -50 + c \]

Hence we must have $c = 50$ and so

\[ x(t) = -50\cos t + 50 = 50(1 - \cos t). \]


Now that we have our position as a function of time, we can determine how long it
takes us to arrive there. That is, we can find the time $T$ so that $x(T) = 25$.

\begin{align*}
25 = x(T) &= 50(1 - \cos T) && \text{so} \\
\frac{1}{2} &= 1 - \cos T \\
-\frac{1}{2} &= -\cos T \\
\frac{1}{2} &= \cos T.
\end{align*}

Recalling our special triangles, we see that $T = \frac{\pi}{3}$.
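The answer to Example 4.1.5 can be double-checked in two independent ways: by plugging $T = \pi/3$ into the position formula, and by adding up velocity times small time steps, which anticipates the integral calculus mentioned at the start of this section. The sketch below does both; the midpoint-rule helper `integrate_velocity` and its step count are assumptions introduced only for this check.

```python
import math

def velocity(t):
    return 50 * math.sin(t)

def position(t):
    """x(t) = 50 (1 - cos t), the antiderivative of v with x(0) = 0."""
    return 50 * (1 - math.cos(t))

def integrate_velocity(t_end, steps=10_000):
    """Midpoint-rule estimate of the distance covered from t = 0 to t_end."""
    dt = t_end / steps
    return sum(velocity((i + 0.5) * dt) for i in range(steps)) * dt

arrival_time = math.acos(0.5)   # solves cos T = 1/2 on [0, pi]

print(arrival_time, math.pi / 3)
print(position(arrival_time), integrate_velocity(arrival_time))
```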


Here is another example, which shows how antiderivatives arise naturally when studying
differential equations.


Example 4.1.6 (Theorem 3.3.2 revisited.)


Back in Section 3.3 we encountered a simple differential equation, namely equation 3.3.1.
We were able to solve this equation by guessing the answer and then checking it carefully.
We can derive the solution more systematically by using antiderivatives.

Recall equation 3.3.1:

\[ \frac{dQ}{dt} = -kQ(t) \]

where $Q(t)$ is the amount of radioactive material at time $t$ and we assume $Q(t) > 0$. Take
this equation and divide both sides by $Q(t)$ to get

\[ \frac{1}{Q(t)} \frac{dQ}{dt} = -k \]

At this point we should$^{1}$ think that the left-hand side is familiar. Now is a good moment
to look back at logarithmic differentiation in Section 2.10.


The left-hand side is just the derivative of $\log Q(t)$:

\[ \frac{d}{dt} \big( \log Q(t) \big) = \frac{1}{Q(t)} \frac{dQ}{dt} = -k \]

So to solve this equation, we are really being asked to find all functions $\log Q(t)$ having
derivative $-k$. That is, we need to find all antiderivatives of $-k$. Of course that is just
$-kt + c$. Hence we must have

\[ \log Q(t) = -kt + c \]

and then taking the exponential of both sides gives

\[ Q(t) = e^{-kt + c} = e^{c} \cdot e^{-kt} = C e^{-kt} \]

where $C = e^{c}$. This is precisely Theorem 3.3.2.
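A quick numerical check of Example 4.1.6: the function $Q(t) = Ce^{-kt}$ should make $\frac{dQ}{dt} + kQ$ vanish, and $\log Q(t)$ should be the straight line $-kt + c$ with $c = \log C$. The values $C = 5$ and $k = 0.3$ below are arbitrary illustrative choices, and the derivative is only approximated by a difference quotient.

```python
import math

def Q(t, C=5.0, k=0.3):
    """Candidate solution Q(t) = C e^(-kt); C and k are illustrative values."""
    return C * math.exp(-k * t)

def residual(t, C=5.0, k=0.3, h=1e-6):
    """Approximate dQ/dt + k Q(t); this should be close to zero."""
    dQ_dt = (Q(t + h, C, k) - Q(t - h, C, k)) / (2 * h)
    return dQ_dt + k * Q(t, C, k)

residuals = [residual(t) for t in (0.0, 1.0, 10.0)]
log_gap = math.log(Q(2.0)) - (-0.3 * 2.0 + math.log(5.0))

print(residuals, log_gap)
```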


The above is a small example of the interplay between antiderivatives and differential 
equations. 

Here is another example of how we might use antidifferentiation to compute areas or 
volumes. 


Example 4.1.7 


We know (especially if one has revised the material in the appendix and Appendix B.4.4 
in particular) that the volume of a right-circular cone is 


\[ V = \frac{1}{3}\pi r^2 h \]


where h is the height of the cone and r is the radius of its base. Now, the derivation of this
formula given in Appendix B.4.4 is not too simple. We present an alternate proof here that
uses antiderivatives.


1 Well, perhaps it is better to say “notice that”. Let’s not make this a moral point.

[Figure: the cone cut at heights X and x; the removed slice of thickness x − X lies between a cylinder of radius Y (lower bound) and a cylinder of radius y (upper bound).]


Consider cutting off a portion of the cone so that its new height is x (rather than h).
Call the volume of the resulting smaller cone V(x). We are going to determine V(x) for
all x ≥ 0, including x = h, by first evaluating V’(x) and V(0) (which is obviously 0).

Call the radius of the base of the new smaller cone y (rather than r). By similar triangles
we know that

\[ \frac{r}{h} = \frac{y}{x}. \]

Now keep x and y fixed and consider cutting off a little more of the cone so its height is X.
When we do so, the radius of the base changes from y to Y and again by similar triangles
we know that

\[ \frac{Y}{X} = \frac{y}{x} = \frac{r}{h}. \]

The change in volume is then

\[ V(x) - V(X). \]


Of course if we knew the formula for the volume of a cone, then we could compute the 
above exactly. However, even without knowing the volume of a cone, it is easy to derive 
upper and lower bounds on this quantity. The piece removed has bottom radius y and top 
radius Y. Hence its volume is bounded above and below by the cylinders of height x — X 
and with radius y and Y respectively. Hence 


\[ \pi Y^2 (x - X) \le V(x) - V(X) \le \pi y^2 (x - X) \]


since the volume of a cylinder is just the area of its base times its height. Now massage this
expression a little:


\[ \pi Y^2 \le \frac{V(x) - V(X)}{x - X} \le \pi y^2. \]


The middle term now looks like a derivative; all we need to do is take the limit as X → x:

\[ \lim_{X\to x} \pi Y^2 \le \lim_{X\to x} \frac{V(x) - V(X)}{x - X} \le \lim_{X\to x} \pi y^2. \]

The rightmost term is independent of X and so is just πy². In the leftmost term, as X → x,
we must have that Y → y. Hence the leftmost term is also just πy². Then by the squeeze
theorem (Theorem 1.4.17) we know that

\[ \frac{dV}{dx} = \lim_{X\to x} \frac{V(x) - V(X)}{x - X} = \pi y^2. \]

But we know that

\[ y = \frac{r}{h}\,x \]

so

\[ \frac{dV}{dx} = \pi \left(\frac{r}{h}\right)^2 x^2. \]

Now we can antidifferentiate to get back to V: 


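\[ V(x) = \frac{\pi}{3}\left(\frac{r}{h}\right)^2 x^3 + c. \]


To determine c notice that when x = 0 the volume of the cone is just zero and so c = 0.
Thus


\[ V(x) = \frac{\pi}{3}\left(\frac{r}{h}\right)^2 x^3 \]


and so when x = h we are left with


\[ V(h) = \frac{\pi}{3}\left(\frac{r}{h}\right)^2 h^3 = \frac{1}{3}\pi r^2 h \]


as required.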
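The squeeze argument above can be illustrated numerically. The sketch below takes the final formula for V(x) and confirms that the change in volume lies between the two cylinder bounds as X approaches x; the dimensions r = 2 and h = 5 are arbitrary assumptions made for the illustration.

```python
import math

r, h = 2.0, 5.0  # hypothetical cone dimensions

def V(x):
    """Volume of the cone cut off at height x: (pi/3) (r/h)^2 x^3."""
    return math.pi / 3 * (r / h) ** 2 * x ** 3

def base_radius(x):
    """Radius of the base of the cone of height x, by similar triangles."""
    return r / h * x

x = 3.0
for X in (2.0, 2.9, 2.999):
    lower = math.pi * base_radius(X) ** 2 * (x - X)  # cylinder of radius Y
    upper = math.pi * base_radius(x) ** 2 * (x - X)  # cylinder of radius y
    change = V(x) - V(X)
    # The difference quotient should approach pi * y^2 as X approaches x.
    print(X, lower <= change <= upper, change / (x - X))
```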


Appendix A


HIGH SCHOOL MATERIAL 


This chapter is really split into two parts.
• We expect you to understand and know Sections A.1 to A.11.


• The very last section, Section A.12, contains results that we don’t expect you to memorise,
but that we think you should be able to quickly derive from other results you
know.


A.1 ▲ Similar triangles


[Figure: two similar triangles, one with sides a, b, c and the other with sides A, B, C.]

Two triangles T₁, T₂ are similar when


• (AAA: angle angle angle) The angles of T₁ are the same as the angles of T₂.


• (SSS: side side side) The ratios of the side lengths are the same. That is

\[ \frac{A}{a} = \frac{B}{b} = \frac{C}{c}. \]


• (SAS: side angle side) Two sides have lengths in the same ratio and the angle
between them is the same. For example

\[ \frac{A}{a} = \frac{C}{c} \quad\text{and angle } \beta \text{ is the same.} \]

A.2 ▲ Pythagoras


For a right-angled triangle the length of the hypotenuse is related to the lengths of the
other two sides by

\[ (\text{adjacent})^2 + (\text{opposite})^2 = (\text{hypotenuse})^2. \]

[Figure: right-angled triangle with sides labelled adjacent, opposite and hypotenuse.]


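A.3 ▲ Trigonometry — definitions


[Figure: right-angled triangle with angle θ and sides labelled opposite, adjacent and hypotenuse.]

\begin{align*}
\sin\theta &= \frac{\text{opposite}}{\text{hypotenuse}} & \csc\theta &= \frac{1}{\sin\theta} \\
\cos\theta &= \frac{\text{adjacent}}{\text{hypotenuse}} & \sec\theta &= \frac{1}{\cos\theta} \\
\tan\theta &= \frac{\text{opposite}}{\text{adjacent}} & \cot\theta &= \frac{1}{\tan\theta}
\end{align*}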


A.4 ▲ Radians, arcs and sectors


For a circle of radius r and angle of θ radians:
• Arc length L(θ) = rθ.
• Area of sector A(θ) = (θ/2) r².


A.5 ▲ Trigonometry — graphs


[Figure: graphs of sin θ, cos θ and tan θ.]
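A.6 ▲ Trigonometry — special triangles

[Figure: the two special triangles, one with sides 1, 1, √2 and angles π/4, π/4, π/2, the other with sides 1, √3, 2 and angles π/6, π/3, π/2.]

From the above pair of special triangles we have

\begin{align*}
\sin\frac{\pi}{4} &= \frac{1}{\sqrt{2}} & \sin\frac{\pi}{6} &= \frac{1}{2} & \sin\frac{\pi}{3} &= \frac{\sqrt{3}}{2} \\
\cos\frac{\pi}{4} &= \frac{1}{\sqrt{2}} & \cos\frac{\pi}{6} &= \frac{\sqrt{3}}{2} & \cos\frac{\pi}{3} &= \frac{1}{2} \\
\tan\frac{\pi}{4} &= 1 & \tan\frac{\pi}{6} &= \frac{1}{\sqrt{3}} & \tan\frac{\pi}{3} &= \sqrt{3}
\end{align*}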


A.7 ▲ Trigonometry — simple identities


• Periodicity

\[ \sin(\theta + 2\pi) = \sin(\theta) \qquad \cos(\theta + 2\pi) = \cos(\theta) \]


• Reflection

\[ \sin(-\theta) = -\sin(\theta) \qquad \cos(-\theta) = \cos(\theta) \]


• Reflection around π/4

\[ \sin\left(\tfrac{\pi}{2} - \theta\right) = \cos\theta \qquad \cos\left(\tfrac{\pi}{2} - \theta\right) = \sin\theta \]


• Pythagoras

\[ \sin^2\theta + \cos^2\theta = 1 \]


A.8 ▲ Trigonometry — add and subtract angles
• Sine

\[ \sin(\alpha \pm \beta) = \sin(\alpha)\cos(\beta) \pm \cos(\alpha)\sin(\beta) \]

• Cosine

\[ \cos(\alpha \pm \beta) = \cos(\alpha)\cos(\beta) \mp \sin(\alpha)\sin(\beta) \]


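A.9 ▲ Inverse trigonometric functions


Some of you may not have studied inverse trigonometric functions in high school; however,
we still expect you to know them by the end of the course.


          arcsin x                  arccos x              arctan x
Domain:   −1 ≤ x ≤ 1                −1 ≤ x ≤ 1            all real numbers
Range:    −π/2 ≤ arcsin x ≤ π/2     0 ≤ arccos x ≤ π      −π/2 < arctan x < π/2

[Figure: graphs of arcsin x, arccos x and arctan x.]

Since these functions are inverses of each other we have

\begin{align*}
\arcsin(\sin\theta) &= \theta & -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2} \\
\arccos(\cos\theta) &= \theta & 0 \le \theta \le \pi \\
\arctan(\tan\theta) &= \theta & -\tfrac{\pi}{2} < \theta < \tfrac{\pi}{2}
\end{align*}

and also

\begin{align*}
\sin(\arcsin x) &= x & -1 \le x \le 1 \\
\cos(\arccos x) &= x & -1 \le x \le 1 \\
\tan(\arctan x) &= x & \text{any real } x
\end{align*}

A.10 ▲ Areas


[Figure: a rectangle, a triangle, a circle and an ellipse with their dimensions labelled.]


• Area of a rectangle (base b, height h)

\[ A = bh \]

• Area of a triangle (base b, height h, or sides a and b enclosing the angle θ)

\[ A = \tfrac{1}{2}bh = \tfrac{1}{2}ab\sin\theta \]

• Area of a circle (radius r)

\[ A = \pi r^2 \]

• Area of an ellipse (semi-axes a and b)

\[ A = \pi ab \]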


A.11 ▲ Volumes


• Volume of a rectangular prism

\[ V = lwh \]

• Volume of a cylinder

\[ V = \pi r^2 h \]

• Volume of a cone

\[ V = \tfrac{1}{3}\pi r^2 h \]

• Volume of a sphere

\[ V = \tfrac{4}{3}\pi r^3 \]


A.12 ▲ High school material you should be able to derive


• Graphs of csc θ, sec θ and cot θ:

[Figure: graphs of csc θ, sec θ and cot θ.]


• More Pythagoras

\begin{align*}
\sin^2\theta + \cos^2\theta = 1 &\xrightarrow{\ \text{divide by } \cos^2\theta\ } \tan^2\theta + 1 = \sec^2\theta \\
\sin^2\theta + \cos^2\theta = 1 &\xrightarrow{\ \text{divide by } \sin^2\theta\ } 1 + \cot^2\theta = \csc^2\theta
\end{align*}

• Sine, double angle (set β = α in the sine angle addition formula)

\[ \sin(2\alpha) = 2\sin(\alpha)\cos(\alpha) \]


• Cosine, double angle (set β = α in the cosine angle addition formula)

\begin{align*}
\cos(2\alpha) &= \cos^2(\alpha) - \sin^2(\alpha) \\
&= 2\cos^2(\alpha) - 1 & \text{(use } \sin^2(\alpha) = 1 - \cos^2(\alpha)\text{)} \\
&= 1 - 2\sin^2(\alpha) & \text{(use } \cos^2(\alpha) = 1 - \sin^2(\alpha)\text{)}
\end{align*}


• Composition of trigonometric and inverse trigonometric functions:

\[ \cos(\arcsin x) = \sqrt{1 - x^2} \qquad \sec(\arctan x) = \sqrt{1 + x^2} \]

and similar expressions.

Appendix B


ORIGIN OF TRIG, AREA AND VOLUME FORMULAS


B.1 ▲ Theorems about triangles


B.1.1 » Thales’ theorem


We want to get at right-angled triangles. A classic construction for this is to draw a triangle
inside a circle, so that all three corners lie on the circle and the longest side forms the
diameter of the circle. See the figure below in which we have scaled the circle to have
radius 1 and the triangle has longest side 2.

[Figure: triangle ABC inscribed in a circle with centre O, with AB a diameter.]

Thales’ theorem states that the angle at C is always a right-angle. The proof is quite
straightforward and relies on two facts:


• the angles of a triangle add to π, and
• the angles at the base of an isosceles triangle are equal.


So we split the triangle ABC by drawing a line from the centre of the circle to C. This
creates two isosceles triangles OAC and OBC. Since they are isosceles, the angles at their
bases α and β must be equal (as shown). Adding the angles of the original triangle now
gives

\[ \pi = \alpha + (\alpha + \beta) + \beta = 2(\alpha + \beta). \]

So the angle at C = π − (α + β) = π/2.
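Thales’ theorem is easy to test numerically. The following sketch places A = (−1, 0) and B = (1, 0) at the ends of a diameter of the unit circle, picks a few positions for C on the circle, and measures the angle at C with a dot product. The coordinates and the function name are assumptions made for this illustration.

```python
import math

def angle_at_C(phi):
    """Angle ACB, where A = (-1, 0), B = (1, 0) and C = (cos phi, sin phi)."""
    cx, cy = math.cos(phi), math.sin(phi)
    ux, uy = -1.0 - cx, 0.0 - cy   # vector from C to A
    vx, vy = 1.0 - cx, 0.0 - cy    # vector from C to B
    dot = ux * vx + uy * vy
    return math.acos(dot / (math.hypot(ux, uy) * math.hypot(vx, vy)))

for phi in (0.3, 1.0, 2.5):
    # Each angle should agree with pi/2 up to rounding.
    print(phi, angle_at_C(phi), math.pi / 2)
```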
B.1.2 » Pythagoras


Since trigonometry, at its core, is the study of lengths and angles in right-angled triangles, 
we must include a result you all know well, but likely do not know how to prove. 


[Figure: right-angled triangle with sides labelled adjacent, opposite and hypotenuse.]


The lengths of the sides of any right-angled triangle are related by the famous result due
to Pythagoras

\[ (\text{adjacent})^2 + (\text{opposite})^2 = (\text{hypotenuse})^2. \]

There are many ways to prove this, but we can do so quite simply by studying the following
diagram:

[Figure: left, a right-angled triangle with sides a, b and c; centre, a square of side-length a + b containing four copies of the triangle, with white area = a² + b²; right, the same square with the triangles rearranged, with white area = c².]

We start with a right-angled triangle with sides labelled a, b and c. Then we construct a
square of side-length a + b and draw inside it 4 copies of the triangle arranged as shown
in the centre of the above figure. The area in white is then a² + b². Now move the triangles
around to create the arrangement shown on the right of the above figure. The area in white
is bounded by a square of side-length c and so its area is c². The area of the outer square
didn’t change when the triangles were moved, nor did the area of the triangles, so the
white area cannot have changed either. This proves a² + b² = c².


B.2 ▲ Trigonometry


B.2.1 » Angles — radians vs degrees


For mathematics, and especially in calculus, it is much better to measure angles in units
called radians rather than degrees. By definition, an arc of length θ on a circle of radius
one subtends an angle of θ radians at the centre of the circle.

[Figure: left, a circle of radius 1 with an arc of length θ; right, a circle of radius r showing the arc length L(θ) and the sector area A(θ).]

The circle on the left has radius 1, and the arc swept out by an angle of θ radians has
length θ. Because a circle of radius one has circumference 2π we have

2π radians = 360°      π radians = 180°      π/2 radians = 90°
π/3 radians = 60°      π/4 radians = 45°     π/6 radians = 30°

More generally, consider a circle of radius r. Let L(θ) denote the length of the arc swept
out by an angle of θ radians and let A(θ) denote the area of the sector (or wedge) swept
out by the same angle. Since the angle sweeps out the fraction θ/(2π) of a whole circle, we
have

\[ L(\theta) = 2\pi r \cdot \frac{\theta}{2\pi} = \theta r \qquad\text{and}\qquad A(\theta) = \pi r^2 \cdot \frac{\theta}{2\pi} = \frac{\theta}{2} r^2. \]

B.2.2 » Trig function definitions 


The trigonometric functions are defined as ratios of the lengths of the sides of a
right-angled triangle as shown on the left of the diagram below. These ratios depend only on the
angle θ.

[Figure: left, a right-angled triangle with angle θ and sides labelled opposite, adjacent and hypotenuse; right, the same triangle scaled to have hypotenuse 1, with sin θ, cos θ and tan θ marked.]

The trigonometric functions sine, cosine and tangent are defined as ratios of the lengths
of the sides

\[ \sin\theta = \frac{\text{opposite}}{\text{hypotenuse}} \qquad \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}} \qquad \tan\theta = \frac{\text{opposite}}{\text{adjacent}} = \frac{\sin\theta}{\cos\theta}. \]

These are frequently abbreviated as

\[ \sin\theta = \frac{o}{h} \qquad \cos\theta = \frac{a}{h} \qquad \tan\theta = \frac{o}{a} \]

which gives rise to the mnemonic

SOH CAH TOA

If we scale the triangle so that the hypotenuse has length 1 then we obtain the diagram
on the right. In that case, sin θ is the height of the triangle, cos θ the length of its base and
tan θ is the length of the line tangent to the circle of radius 1 as shown.

Since the angle 2π sweeps out a full circle, the angles θ and θ + 2π are really the same.
Hence all the trigonometric functions are periodic with period 2π. That is

\[ \sin(\theta + 2\pi) = \sin(\theta) \qquad \cos(\theta + 2\pi) = \cos(\theta) \qquad \tan(\theta + 2\pi) = \tan(\theta) \]

The plots of these functions are shown below

[Figure: graphs of sin θ, cos θ and tan θ.]

The reciprocals (cosecant, secant and cotangent) of these functions also play important
roles in trigonometry and calculus:

\[ \csc\theta = \frac{1}{\sin\theta} = \frac{h}{o} \qquad \sec\theta = \frac{1}{\cos\theta} = \frac{h}{a} \qquad \cot\theta = \frac{1}{\tan\theta} = \frac{\cos\theta}{\sin\theta} = \frac{a}{o} \]

The plots of these functions are shown below

[Figure: graphs of csc θ, sec θ and cot θ.]

These reciprocal functions also have geometric interpretations:

[Figure: triangles on the unit circle showing csc θ, sec θ and cot θ as lengths.]

Since these are all right-angled triangles we can use Pythagoras to obtain the following
identities:

\[ \sin^2\theta + \cos^2\theta = 1 \qquad \tan^2\theta + 1 = \sec^2\theta \qquad 1 + \cot^2\theta = \csc^2\theta \]

Of these it is only necessary to remember the first

\[ \sin^2\theta + \cos^2\theta = 1. \]

The second can then be obtained by dividing this by cos²θ and the third by dividing by
sin²θ.
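The three Pythagorean identities can be spot-checked numerically. The sketch below (the function name is ours) returns the difference between the two sides of each identity; the sample angles avoid multiples of π/2, where some of the functions are undefined.

```python
import math

def pythagorean_residuals(theta):
    """Left side minus right side for each of the three identities."""
    s, c = math.sin(theta), math.cos(theta)
    tan, sec = s / c, 1 / c
    cot, csc = c / s, 1 / s
    return (s * s + c * c - 1,
            tan * tan + 1 - sec * sec,
            1 + cot * cot - csc * csc)

for theta in (0.3, 1.1, 2.5):
    # Every residual should be zero up to rounding error.
    print(theta, pythagorean_residuals(theta))
```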


B.2.3 » Important triangles 


Computing sine and cosine is non-trivial for general angles; we need Taylor series (or
similar tools) to do this. However there are some special angles (usually small integer
fractions of π) for which we can use a little geometry to help. Consider the following two
triangles.

[Figure: the two special triangles, one with sides 1, 1, √2 and the other with sides 1, √3, 2.]

The first results from cutting a square along its diagonal, while the second is obtained by
cutting an equilateral triangle from one corner to the middle of the opposite side. These,
together with the angles 0, π/2 and π give the following table of values


θ         | sin θ | cos θ | tan θ | csc θ | sec θ | cot θ
0 rad     | 0     | 1     | 0     | DNE   | 1     | DNE
π/2 rad   | 1     | 0     | DNE   | 1     | DNE   | 0
π rad     | 0     | −1    | 0     | DNE   | −1    | DNE
π/4 rad   | 1/√2  | 1/√2  | 1     | √2    | √2    | 1
π/6 rad   | 1/2   | √3/2  | 1/√3  | 2     | 2/√3  | √3
π/3 rad   | √3/2  | 1/2   | √3    | 2/√3  | 2     | 1/√3

B.2.4 » Some more simple identities 


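Consider the figure below

[Figure: left, the points (cos θ, sin θ) and (cos θ, −sin θ) on the unit circle, at angles θ and −θ; right, a right-angled triangle with angles θ and π/2 − θ.]

The pair of triangles on the left shows that there is a simple relationship between
trigonometric functions evaluated at θ and at −θ:

\[ \sin(-\theta) = -\sin(\theta) \qquad \cos(-\theta) = \cos(\theta) \]

That is, sine is an odd function, while cosine is even. Since the other trigonometric
functions can be expressed in terms of sine and cosine we obtain

\[ \tan(-\theta) = -\tan(\theta) \qquad \csc(-\theta) = -\csc(\theta) \qquad \sec(-\theta) = \sec(\theta) \qquad \cot(-\theta) = -\cot(\theta) \]

Now consider the triangle on the right. If we consider the angle π/2 − θ the side-lengths of
the triangle remain unchanged, but the roles of “opposite” and “adjacent” are swapped.
Hence we have

\[ \sin\left(\tfrac{\pi}{2} - \theta\right) = \cos\theta \qquad \cos\left(\tfrac{\pi}{2} - \theta\right) = \sin\theta \]

Again these imply that

\[ \tan\left(\tfrac{\pi}{2} - \theta\right) = \cot\theta \qquad \csc\left(\tfrac{\pi}{2} - \theta\right) = \sec\theta \qquad \sec\left(\tfrac{\pi}{2} - \theta\right) = \csc\theta \qquad \cot\left(\tfrac{\pi}{2} - \theta\right) = \tan\theta \]

We can go further. Consider the following diagram:

[Figure: unit circle showing the points (−cos θ, sin θ), (cos θ, −sin θ) and (−cos θ, −sin θ).]

This implies that

\begin{align*}
\sin(\pi - \theta) &= \sin(\theta) & \cos(\pi - \theta) &= -\cos(\theta) \\
\sin(\pi + \theta) &= -\sin(\theta) & \cos(\pi + \theta) &= -\cos(\theta)
\end{align*}

From these we can get the rules for the other four trigonometric functions.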


B.2.5 » Identities — adding angles 
We wish to explain the origins of the identity

\[ \sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta). \]

A very geometric demonstration uses the figure below and an observation about areas.

[Figure: three panels showing two right-angled triangles with angles α and β, rearranged and then rescaled so that q cos β = cos α.]

• The left-most figure shows two right-angled triangles with angles α and β and both
with hypotenuse length 1.

• The next figure simply rearranges the triangles, translating and rotating the lower
triangle so that it lies adjacent to the top of the upper triangle.

• Now scale the lower triangle by a factor of q so that the edges opposite the angles α and
β are flush. This means that q cos β = cos α, i.e.

\[ q = \frac{\cos\alpha}{\cos\beta}. \]

Now compute the areas of these (blue and red) triangles

\begin{align*}
A_{\text{red}} &= \tfrac{1}{2} q^2 \sin\beta\cos\beta \\
A_{\text{blue}} &= \tfrac{1}{2} \sin\alpha\cos\alpha
\end{align*}

So twice the total area is

\[ 2A_{\text{total}} = \sin\alpha\cos\alpha + q^2 \sin\beta\cos\beta. \]

• But we can also compute the total area using the rightmost triangle:

\[ 2A_{\text{total}} = q\sin(\alpha + \beta). \]

Since the total area must be the same no matter how we compute it we have

\begin{align*}
q\sin(\alpha + \beta) &= \sin\alpha\cos\alpha + q^2 \sin\beta\cos\beta \\
\sin(\alpha + \beta) &= \frac{1}{q}\sin\alpha\cos\alpha + q\sin\beta\cos\beta \\
&= \frac{\cos\beta}{\cos\alpha}\sin\alpha\cos\alpha + \frac{\cos\alpha}{\cos\beta}\sin\beta\cos\beta \\
&= \sin\alpha\cos\beta + \cos\alpha\sin\beta
\end{align*}

as required.
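We can obtain the angle addition formula for cosine by substituting α ↦ π/2 − α and
β ↦ −β into our sine formula:

\begin{align*}
\sin(\alpha + \beta) &= \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta) & \text{becomes} \\
\sin(\pi/2 - \alpha - \beta) &= \sin(\pi/2 - \alpha)\cos(-\beta) + \cos(\pi/2 - \alpha)\sin(-\beta) \\
\cos(\alpha + \beta) &= \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)
\end{align*}

where we have used sin(π/2 − θ) = cos(θ) and cos(π/2 − θ) = sin(θ).
It is then a small step to the formulas for the difference of angles. From the relation

\[ \sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta) \]

we can substitute β ↦ −β and so obtain

\begin{align*}
\sin(\alpha - \beta) &= \sin(\alpha)\cos(-\beta) + \cos(\alpha)\sin(-\beta) \\
&= \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta)
\end{align*}

The formula for cosine can be obtained in a similar manner. To summarise

\begin{align*}
\sin(\alpha \pm \beta) &= \sin(\alpha)\cos(\beta) \pm \cos(\alpha)\sin(\beta) \\
\cos(\alpha \pm \beta) &= \cos(\alpha)\cos(\beta) \mp \sin(\alpha)\sin(\beta)
\end{align*}

The formulas for tangent are a bit more work, but

\begin{align*}
\tan(\alpha + \beta) &= \frac{\sin(\alpha + \beta)}{\cos(\alpha + \beta)} \\
&= \frac{\sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)}{\cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)} \\
&= \frac{\sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)}{\cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)} \cdot \frac{\sec(\alpha)\sec(\beta)}{\sec(\alpha)\sec(\beta)} \\
&= \frac{\sin(\alpha)\sec(\alpha) + \sin(\beta)\sec(\beta)}{1 - \sin(\alpha)\sec(\alpha)\sin(\beta)\sec(\beta)} \\
&= \frac{\tan(\alpha) + \tan(\beta)}{1 - \tan(\alpha)\tan(\beta)}
\end{align*}

and similarly we get

\[ \tan(\alpha - \beta) = \frac{\tan(\alpha) - \tan(\beta)}{1 + \tan(\alpha)\tan(\beta)}. \]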
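The addition formulas for sine, cosine and tangent can be spot-checked with a few numerical values. In the sketch below (the function name and sample angles are ours), each residual is the left side minus the right side; the sample pairs avoid angles where tangent is undefined.

```python
import math

def addition_residuals(a, b):
    """Left side minus right side of the sine, cosine and tangent addition formulas."""
    sin_err = math.sin(a + b) - (math.sin(a) * math.cos(b) + math.cos(a) * math.sin(b))
    cos_err = math.cos(a + b) - (math.cos(a) * math.cos(b) - math.sin(a) * math.sin(b))
    tan_err = math.tan(a + b) - (math.tan(a) + math.tan(b)) / (1 - math.tan(a) * math.tan(b))
    return sin_err, cos_err, tan_err

for a, b in ((0.2, 0.5), (1.0, -0.4), (0.7, 0.3)):
    # All three residuals should vanish up to rounding.
    print(a, b, addition_residuals(a, b))
```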
B.2.6 » Identities — double-angle formulas 
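If we set β = α in the angle-addition formulas we get

\begin{align*}
\sin(2\alpha) &= 2\sin(\alpha)\cos(\alpha) \\
\cos(2\alpha) &= \cos^2(\alpha) - \sin^2(\alpha) \\
&= 2\cos^2(\alpha) - 1 & \text{since } \sin^2\alpha = 1 - \cos^2\alpha \\
&= 1 - 2\sin^2(\alpha) & \text{since } \cos^2\alpha = 1 - \sin^2\alpha \\
\tan(2\alpha) &= \frac{2\tan(\alpha)}{1 - \tan^2(\alpha)} \\
&= \frac{2}{\cot(\alpha) - \tan(\alpha)} & \text{divide top and bottom by } \tan(\alpha)
\end{align*}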


B.2.7 » Identities — extras 


>>> Sums to products 


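Consider the identities

\[ \sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta) \qquad \sin(\alpha - \beta) = \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta) \]

If we add them together some terms on the right-hand side cancel:

\[ \sin(\alpha + \beta) + \sin(\alpha - \beta) = 2\sin(\alpha)\cos(\beta). \]

If we now set u = α + β and v = α − β (i.e. α = (u + v)/2, β = (u − v)/2) then

\[ \sin(u) + \sin(v) = 2\sin\left(\frac{u+v}{2}\right)\cos\left(\frac{u-v}{2}\right) \]

This transforms a sum into a product. Similarly:

\begin{align*}
\sin(u) - \sin(v) &= 2\sin\left(\frac{u-v}{2}\right)\cos\left(\frac{u+v}{2}\right) \\
\cos(u) + \cos(v) &= 2\cos\left(\frac{u+v}{2}\right)\cos\left(\frac{u-v}{2}\right) \\
\cos(u) - \cos(v) &= -2\sin\left(\frac{u+v}{2}\right)\sin\left(\frac{u-v}{2}\right)
\end{align*}


>>> Products to sums


Again consider the identities

\[ \sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta) \qquad \sin(\alpha - \beta) = \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta) \]

and add them together:

\[ \sin(\alpha + \beta) + \sin(\alpha - \beta) = 2\sin(\alpha)\cos(\beta). \]

Then rearrange:

\[ \sin(\alpha)\cos(\beta) = \frac{\sin(\alpha + \beta) + \sin(\alpha - \beta)}{2} \]

In a similar way, start with the identities

\[ \cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta) \qquad \cos(\alpha - \beta) = \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta) \]

If we add these together we get

\[ 2\cos(\alpha)\cos(\beta) = \cos(\alpha + \beta) + \cos(\alpha - \beta) \]

while taking their difference gives

\[ 2\sin(\alpha)\sin(\beta) = \cos(\alpha - \beta) - \cos(\alpha + \beta) \]

Hence

\begin{align*}
\sin(\alpha)\sin(\beta) &= \frac{\cos(\alpha - \beta) - \cos(\alpha + \beta)}{2} \\
\cos(\alpha)\cos(\beta) &= \frac{\cos(\alpha - \beta) + \cos(\alpha + \beta)}{2}
\end{align*}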
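As an illustration, the sum-to-product and product-to-sum formulas can be compared against direct evaluation. The helper names and the sample values below are assumptions made for this sketch.

```python
import math

def sum_of_sines_as_product(u, v):
    """Right-hand side of sin u + sin v = 2 sin((u+v)/2) cos((u-v)/2)."""
    return 2 * math.sin((u + v) / 2) * math.cos((u - v) / 2)

def sin_cos_product_as_sum(a, b):
    """Right-hand side of sin a cos b = (sin(a+b) + sin(a-b)) / 2."""
    return (math.sin(a + b) + math.sin(a - b)) / 2

u, v = 1.3, 0.4
print(math.sin(u) + math.sin(v), sum_of_sines_as_product(u, v))  # the two values agree
print(math.sin(u) * math.cos(v), sin_cos_product_as_sum(u, v))   # the two values agree
```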
B.3 ▲ Inverse trigonometric functions


In order to construct inverse trigonometric functions we first have to restrict their domains
so as to make them one-to-one (or injective). We do this as shown below


          sin θ                 cos θ               tan θ
Domain:   −π/2 ≤ θ ≤ π/2        0 ≤ θ ≤ π           −π/2 < θ < π/2
Range:    −1 ≤ sin θ ≤ 1        −1 ≤ cos θ ≤ 1      all real numbers

[Figure: graphs of sin θ, cos θ and tan θ on these restricted domains.]

          arcsin x                  arccos x              arctan x
Domain:   −1 ≤ x ≤ 1                −1 ≤ x ≤ 1            all real numbers
Range:    −π/2 ≤ arcsin x ≤ π/2     0 ≤ arccos x ≤ π      −π/2 < arctan x < π/2

[Figure: graphs of arcsin x, arccos x and arctan x.]

Since these functions are inverses of each other we have

\begin{align*}
\arcsin(\sin\theta) &= \theta & -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2} \\
\arccos(\cos\theta) &= \theta & 0 \le \theta \le \pi \\
\arctan(\tan\theta) &= \theta & -\tfrac{\pi}{2} < \theta < \tfrac{\pi}{2}
\end{align*}

and also

\begin{align*}
\sin(\arcsin x) &= x & -1 \le x \le 1 \\
\cos(\arccos x) &= x & -1 \le x \le 1 \\
\tan(\arctan x) &= x & \text{any real } x
\end{align*}
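The domain restrictions matter: arcsin(sin θ) returns θ only when θ lies in the restricted range. A small sketch using Python's `math.asin` and `math.acos` (which return values in the ranges listed above) illustrates this; the sample angles are arbitrary.

```python
import math

inside = 0.4    # lies in [-pi/2, pi/2]
outside = 2.5   # lies outside [-pi/2, pi/2]

recovered_inside = math.asin(math.sin(inside))    # gives back 0.4 (up to rounding)
recovered_outside = math.asin(math.sin(outside))  # gives pi - 2.5 rather than 2.5
print(recovered_inside, recovered_outside, math.pi - outside)

# Similarly arccos(cos(theta)) returns the angle in [0, pi] with the same cosine.
print(math.acos(math.cos(-1.0)))                  # gives 1.0 (up to rounding), not -1.0
```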
We can read other combinations of trig functions and their inverses, like, for example,
cos(arcsin x), off of triangles like

[Figure: right-angled triangle with angle θ, hypotenuse 1, opposite side x and adjacent side √(1 − x²).]

We have chosen the hypotenuse and opposite sides of the triangle to be of length 1 and x,
respectively, so that sin(θ) = x. That is, θ = arcsin x. We can then read off of the triangle
that

\[ \cos(\arcsin x) = \cos(\theta) = \sqrt{1 - x^2}. \]

We can reach the same conclusion using trig identities, as follows.


• Write arcsin x = θ. We know that sin(θ) = x and we wish to compute cos(θ). So we
just need to express cos(θ) in terms of sin(θ).


• To do this we make use of one of the Pythagorean identities

\begin{align*}
\sin^2\theta + \cos^2\theta &= 1 \\
\cos\theta &= \pm\sqrt{1 - \sin^2\theta}
\end{align*}

• Thus

\[ \cos(\arcsin x) = \cos\theta = \pm\sqrt{1 - \sin^2\theta} \]


• To determine which branch we should use we need to consider the domain and
range of arcsin x:

Domain: −1 ≤ x ≤ 1      Range: −π/2 ≤ arcsin x ≤ π/2

Thus we are applying cosine to an angle that always lies between −π/2 and π/2. Cosine
is non-negative on this range. Hence we should take the positive branch and

\begin{align*}
\cos(\arcsin x) = \sqrt{1 - \sin^2\theta} &= \sqrt{1 - \sin^2(\arcsin x)} \\
&= \sqrt{1 - x^2}
\end{align*}

In a very similar way we can simplify tan(arccos x).


• Write arccos x = θ, and then

\[ \tan(\arccos x) = \tan\theta = \frac{\sin\theta}{\cos\theta} \]

• Now the denominator is easy since cos θ = cos(arccos x) = x.
• The numerator is almost the same as the previous computation.

\begin{align*}
\sin\theta &= \pm\sqrt{1 - \cos^2\theta} \\
&= \pm\sqrt{1 - x^2}
\end{align*}

• To determine which branch we again consider domains and ranges:

Domain: −1 ≤ x ≤ 1      Range: 0 ≤ arccos x ≤ π

Thus we are applying sine to an angle that always lies between 0 and π. Sine is
non-negative on this range and so we take the positive branch.


• Putting everything back together gives

\[ \tan(\arccos x) = \frac{\sqrt{1 - x^2}}{x}. \]

Completing the 9 possibilities gives:

\begin{align*}
\sin(\arcsin x) &= x & \sin(\arccos x) &= \sqrt{1 - x^2} & \sin(\arctan x) &= \frac{x}{\sqrt{1 + x^2}} \\
\cos(\arcsin x) &= \sqrt{1 - x^2} & \cos(\arccos x) &= x & \cos(\arctan x) &= \frac{1}{\sqrt{1 + x^2}} \\
\tan(\arcsin x) &= \frac{x}{\sqrt{1 - x^2}} & \tan(\arccos x) &= \frac{\sqrt{1 - x^2}}{x} & \tan(\arctan x) &= x
\end{align*}
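A few of the nine compositions can be compared with direct evaluation. The sketch below (helper names are ours) checks three of the algebraic forms against Python's `math` functions, including negative x to confirm that the chosen branches carry the correct sign.

```python
import math

def cos_arcsin(x):
    """Algebraic form of cos(arcsin x)."""
    return math.sqrt(1 - x * x)

def tan_arccos(x):
    """Algebraic form of tan(arccos x); requires x != 0."""
    return math.sqrt(1 - x * x) / x

def sin_arctan(x):
    """Algebraic form of sin(arctan x)."""
    return x / math.sqrt(1 + x * x)

for x in (-0.9, -0.3, 0.5, 0.8):
    # Each pair of numbers should agree up to rounding.
    print(x,
          cos_arcsin(x), math.cos(math.asin(x)),
          tan_arccos(x), math.tan(math.acos(x)),
          sin_arctan(x), math.sin(math.atan(x)))
```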


B.4 ▲ Geometry


B.4.1 » Cosine law or Law of cosines


The cosine law says that, if a triangle has sides of length a, b and c and the angle opposite
the side of length c is γ, then

\[ c^2 = a^2 + b^2 - 2ab\cos\gamma. \]

Observe that, when γ = π/2, this reduces to (surprise!) Pythagoras’ theorem c² = a² + b².
Let’s derive the cosine law.

[Figure: a triangle with sides a, b, c and opposite angles α, β, γ, shown again with a perpendicular dropped onto the side of length c.]

Consider the triangle on the left. Now draw a perpendicular line from the side of
length c to the opposite corner as shown. This demonstrates that

\[ c = a\cos\beta + b\cos\alpha. \]

Multiply this by c to get an expression for c²:

\[ c^2 = ac\cos\beta + bc\cos\alpha. \]

Doing similarly for the other corners gives

\begin{align*}
a^2 &= ac\cos\beta + ab\cos\gamma \\
b^2 &= bc\cos\alpha + ab\cos\gamma
\end{align*}

Now combining these:

\begin{align*}
a^2 + b^2 - c^2 &= (bc - bc)\cos\alpha + (ac - ac)\cos\beta + 2ab\cos\gamma \\
&= 2ab\cos\gamma
\end{align*}

as required.
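The cosine law can be checked against coordinates. The sketch below assumes a triangle with the angle γ at the origin, one side of length a along the x-axis and the other side of length b at angle γ; the side c is then measured directly and compared with the formula. The particular values of a, b and γ are arbitrary.

```python
import math

def third_side(a, b, gamma):
    """Side c opposite the angle gamma, from the cosine law."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(gamma))

a, b, gamma = 3.0, 4.0, 1.2
corner_B = (a, 0.0)                                    # end of the side of length a
corner_A = (b * math.cos(gamma), b * math.sin(gamma))  # end of the side of length b
c_direct = math.hypot(corner_A[0] - corner_B[0], corner_A[1] - corner_B[1])

print(c_direct, third_side(a, b, gamma))  # the two values agree
print(third_side(3.0, 4.0, math.pi / 2))  # right angle: approximately 5, as in Pythagoras
```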


B.4.2 » Sine law or Law of sines 


The sine law says that, if a triangle has sides of length a, b and c and the angles opposite
those sides are α, β and γ, then

\[ \frac{a}{\sin\alpha} = \frac{b}{\sin\beta} = \frac{c}{\sin\gamma}. \]

This rule is best understood by computing the area of the triangle using the formula
A = ½ab sin θ of Appendix A.10. Doing this three ways gives

\begin{align*}
2A &= bc\sin\alpha \\
2A &= ac\sin\beta \\
2A &= ab\sin\gamma
\end{align*}

Dividing these expressions by abc gives

\[ \frac{2A}{abc} = \frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c} \]

as required.
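As a numerical illustration, take a triangle with sides 5, 6 and 7 (an arbitrary choice), recover each angle from the cosine law, and confirm that the three ratios in the sine law coincide. The helper name is ours.

```python
import math

def angle_opposite(side, other1, other2):
    """Angle opposite `side`, recovered from the cosine law."""
    return math.acos((other1 ** 2 + other2 ** 2 - side ** 2) / (2 * other1 * other2))

a, b, c = 5.0, 6.0, 7.0
alpha = angle_opposite(a, b, c)
beta = angle_opposite(b, a, c)
gamma = angle_opposite(c, a, b)

ratios = (a / math.sin(alpha), b / math.sin(beta), c / math.sin(gamma))
print(ratios)                 # three equal values, up to rounding
print(alpha + beta + gamma)   # the angles add to pi
```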
B.4.3 » Where does the formula for the area of a circle come from?


Typically when we come across π for the first time it is as the ratio of the circumference of
a circle to its diameter

\[ \pi = \frac{C}{d} = \frac{C}{2r}. \]

Indeed this is typically the first definition we see of π. It is easy to build an intuition that
the area of the circle should be proportional to the square of its radius. For example we
can draw the largest possible square inside the circle (an inscribed square) and the smallest
possible square outside the circle (a circumscribed square):

[Figure: a circle of radius r with its inscribed and circumscribed squares.]

The smaller square has side-length √2 r and the larger has side-length 2r. Hence

\[ 2r^2 \le A \le 4r^2 \qquad\text{or}\qquad 2 \le \frac{A}{r^2} \le 4. \]

That is, the area of the circle is between 2 and 4 times the square of the radius. What
is perhaps less obvious (if we had not been told this in school) is that the constant of
proportionality for area is also π:

\[ \pi = \frac{A}{r^2}. \]

We will show this using Archimedes’ proof. He makes use of these inscribed and
circumscribed polygons to make better and better approximations of the circle. The steps
of the proof are somewhat involved and the starting point is to rewrite the area of a circle
as

\[ A = \frac{1}{2}Cr \]

where C is (still) the circumference of the circle. This suggests that this area is the same as
that of a triangle of height r and base length C.

[Figure: the circle of circumference C alongside a triangle of height r and base length C.]

Archimedes’ proof then demonstrates that indeed this triangle and the circle have the
same area. Writing T for the area of the triangle and A for the area of the circle, it relies on
a “proof by contradiction”: showing that T < A and T > A
cannot be true and so the only possibility is that A = T.

We will first show that T < A cannot happen. Construct an n-sided “inscribed” polygon
as shown below:

[Figure: a circle with the inscribed polygons p_n and p_{2n}, and one lobe enclosed in a rectangle of dimensions a × 2b.]

Let p_n be the inscribed polygon as shown.

We need 4 steps.


1. The area of p_n is smaller than that of the circle. This follows since we can construct
p_n by cutting slices from the circle.


2. Let E_n be the difference between the area of the circle and p_n: E_n = A − A(p_n) (see
the left of the previous figure). By the previous point we know E_n > 0. Now as we
increase the number of sides, this difference becomes smaller. To be more precise

\[ E_{2n} < \frac{1}{2}E_n. \]

The error E_n is made up of n “lobes”. In the centre-left of the previous figure we
draw one such lobe and surround it by a rectangle of dimensions a × 2b. We could
determine these more precisely using a little trigonometry, but it is not necessary.

This diagram shows the lobe is smaller than the rectangle of base 2b and height a.
Since there are n copies of the lobe, we have

\[ E_n < n \times 2ab \qquad\text{rewrite as}\qquad \frac{E_n}{2} < nab. \]

Now draw in the polygon p_{2n} and consider the associated “error” E_{2n}. If we focus
on the two lobes shown then we see that the area of these two new lobes is equal to
that of the old lobe (shown in centre-left) minus the area of the triangle with base 2b
and height a (drawn in purple). Since there are n copies of this picture we have

\begin{align*}
E_{2n} &= E_n - nab & \text{now use that } nab > E_n/2 \\
&< E_n - \frac{E_n}{2} = \frac{E_n}{2}.
\end{align*}

3. The area of p_n is smaller than T. To see this decompose p_n into n isosceles triangles.
Each of these has base shorter than C/n; the straight line is shorter than the corresponding
arc, though strictly speaking we should prove this. The height of each
triangle is shorter than r. Thus

\begin{align*}
A(p_n) &= n \times \frac{1}{2}(\text{base}) \times (\text{height}) \\
&< n \times \frac{1}{2} \times \frac{C}{n} \times r = \frac{1}{2}Cr = T.
\end{align*}

4. If we assume that T < A, then A − T = d where d is some positive number. However
we know from point 2 that we can make n large enough so that E_n < d (each time
we double n we halve the error). But now we have a contradiction to step 3, since
we have just shown that

\begin{align*}
E_n = A - A(p_n) &< A - T & \text{which implies that} \\
A(p_n) &> T.
\end{align*}

Thus we cannot have T < A.


If we now assume that T > A we will get a similar contradiction by a similar construction.
Now we use regular n-sided circumscribed polygons, P_n.

[Figure: a circle with the circumscribed polygons P_n and P_{2n}, and one lobe with its corners labelled a, b, c, d, e, f.]

The proof can be broken into 4 similar steps.


1. The area of P_n is greater than that of the circle. This follows since we can construct
the circle by trimming the polygon P_n.

2. Let E_n be the difference between the area of the polygon and the circle: E_n = A(P_n) − A
(see the left of the previous figure). By the previous point we know E_n > 0. Now
as we increase the number of sides, this difference becomes smaller. To be more
precise we will show

\[ E_{2n} < \frac{1}{2}E_n. \]

The error E_n is made up of n “lobes”. In the centre-left of the previous figure we
draw one such lobe. Let L_n denote the area of one of these lobes, so E_n = nL_n. In
the centre of the previous figure we have labelled this lobe carefully and also shown
how it changes when we create the polygon P_{2n}. In particular, the original lobe is
bounded by the straight lines ad, af and the arc fbd. We create P_{2n} from P_n by cutting
away the corner triangle △aec. Accordingly the lines ec and ba are orthogonal and
the segments |bc| = |cd|.
By the construction of P_{2n} from P_n, we have

\[ 2L_{2n} = L_n - A(\triangle aec) \qquad\text{or equivalently}\qquad L_{2n} = \frac{1}{2}L_n - A(\triangle abc) \]

And additionally

\[ L_{2n} < A(\triangle bcd) \]

Now consider the triangle △abd (centre-right of the previous figure) and the two
triangles within it, △abc and △bcd. We know that ab and cb form a right-angle.
Consequently ac is the hypotenuse of a right-angled triangle, so |ac| > |bc| = |cd|. So
now, the triangles △abc and △bcd have the same heights, but the base ac is longer
than the base cd. Hence the area of △abc is strictly larger than that of △bcd.

Thus we have

\[ L_{2n} < A(\triangle bcd) < A(\triangle abc) \]

But now we can write

\begin{align*}
L_{2n} = \frac{1}{2}L_n - A(\triangle abc) &< \frac{1}{2}L_n - L_{2n} & \text{rearrange} \\
2L_{2n} &< \frac{1}{2}L_n & \text{there are } n \text{ such lobes, so} \\
2nL_{2n} &< \frac{n}{2}L_n & \text{since } E_n = nL_n \text{, we have} \\
E_{2n} &< \frac{1}{2}E_n & \text{which is what we wanted to show.}
\end{align*}

3. The area of P_n is greater than T. To see this decompose P_n into n isosceles triangles.
The height of each triangle is r, while the base of each is longer than C/n (this is a
subtle point and its proof is equivalent to showing that tan θ > θ). Thus

\begin{align*}
A(P_n) &= n \times \frac{1}{2}(\text{base}) \times (\text{height}) \\
&> n \times \frac{1}{2} \times \frac{C}{n} \times r = \frac{1}{2}Cr = T.
\end{align*}

4. If we assume that T > A, then T − A = d where d is some positive number. However
we know from point 2 that we can make n large enough so that E_n < d (each time
we double n we halve the error). But now we have a contradiction to step 3, since we have
just shown that

\begin{align*}
E_n = A(P_n) - A &< T - A & \text{which implies that} \\
A(P_n) &< T.
\end{align*}

Thus we cannot have T > A. The only possibility that remains is that T = A.
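The behaviour of the two families of polygons can be illustrated numerically. The sketch below uses the standard closed-form areas of regular inscribed and circumscribed n-gons, (n/2) r² sin(2π/n) and n r² tan(π/n); these formulas are not derived in the text and are assumed here only for the illustration. It shows both areas squeezing πr², with the error more than halving each time n is doubled.

```python
import math

def inscribed_area(n, r=1.0):
    """Area of a regular n-gon inscribed in a circle of radius r (standard formula, assumed)."""
    return 0.5 * n * r * r * math.sin(2 * math.pi / n)

def circumscribed_area(n, r=1.0):
    """Area of a regular n-gon circumscribed about a circle of radius r (standard formula, assumed)."""
    return n * r * r * math.tan(math.pi / n)

circle = math.pi  # area of the unit circle
for n in (4, 8, 16, 32, 64):
    inner_error = circle - inscribed_area(n)
    outer_error = circumscribed_area(n) - circle
    # Both errors are positive and shrink as n grows.
    print(n, inner_error, outer_error)
```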


B.4.4 » Where do these volume formulas come from? 


We can establish the volumes of cones and spheres from the formula for the volume of a 
cylinder and a little work with limits and some careful summations. We first need a few 
facts. 


• Every square number can be written as a sum of consecutive odd numbers. More
precisely

\[ n^2 = 1 + 3 + \cdots + (2n - 1). \]

• The sum of the first n positive integers is ½n(n + 1). That is

\[ 1 + 2 + 3 + \cdots + n = \frac{1}{2}n(n + 1). \]

• The sum of the squares of the first n positive integers is (1/6)n(n + 1)(2n + 1). That is

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6}n(n + 1)(2n + 1). \]

We will not give completely rigorous proofs of the above identities (since we are not going 
to assume that the reader knows mathematical induction), rather we will explain them 
using pictorial arguments. The first two of these we can explain by some quite simple 
pictures: 


We see that we can decompose any square of unit-squares into a sequence of strips, each
of which consists of an odd number of unit-squares. This is really just from the fact that

\[ n^2 - (n - 1)^2 = 2n - 1. \]

Similarly, we can represent the sum of the first n integers as a triangle of unit squares
as shown. If we make a second copy of that triangle and arrange it as shown, it gives a
rectangle of dimensions n by n + 1. Hence the rectangle, being exactly twice the size of
the original triangle, contains n(n + 1) unit squares.

The explanation of the last formula takes a little more work and a carefully constructed
picture:

[Figure: three panels showing the sum of squares as stacked squares, the same squares split into coloured bands of odd length, and three copies rearranged into a rectangle.]

Let us break these pictures down step by step
• Leftmost represents the sum of the squares of the first n integers.


• Centre: we recall from above that each square number can be written as a sum of
consecutive odd numbers, which have been represented as coloured bands of unit-squares.


• Make three copies of the sum and arrange them carefully as shown. The first and
third copies are obvious, but the central copy is rearranged considerably; all bands
of the same colour have the same length and have been arranged into rectangles as
shown.


Putting everything from the three copies together creates a rectangle of dimensions
(2n + 1) × (1 + 2 + 3 + ⋯ + n).
We know (from above) that 1 + 2 + 3 + ⋯ + n = ½n(n + 1) and so

\[ (1^2 + 2^2 + \cdots + n^2) = \frac{1}{3} \times \frac{1}{2}n(n + 1)(2n + 1) \]

as required.
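The three summation facts can be confirmed by brute force for small n. The sketch below (the function name is ours) uses exact integer arithmetic, so the comparisons are exact.

```python
def check_sums(n):
    """Return three booleans, one for each summation identity above."""
    odds = sum(2 * k - 1 for k in range(1, n + 1))
    integers = sum(range(1, n + 1))
    squares = sum(k * k for k in range(1, n + 1))
    return (odds == n * n,
            2 * integers == n * (n + 1),
            6 * squares == n * (n + 1) * (2 * n + 1))

print(all(all(check_sums(n)) for n in range(1, 50)))  # True
```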

Now we can start to look at volumes. Let us start with the volume of a cone; con- 
sider the figure below. We bound the volume of the cone above and below by stacks of 
cylinders. The cross-sections of the cylinders and cone are also shown. 
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To obtain the bounds we will construct two stacks of n cylinders, Ci,C2,...,Cn. Each 
cylinder has height h/n and radius that varies with height. In particular, we define cylin- 
der C;, to have height h/n and radius k x r/n. This radius was determined using similar 
triangles so that cylinder C,, has radius r. Now cylinder C; has volume 


kr\? hh 
Vi = 7 x radius” x height = 7 ( =). 
n n 
ae 
— me k 


We obtain an upper bound by stacking cylinders Cj, C2,...,C, as shown. This object 
has volume 


a ee 


_ Tr *h 2 2 yp ao ee: 
= 2 (P 42434-4277) 
_ mrh n(n+1)(2n+1) 
8 6 


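A similar lower bound is obtained by stacking cylinders $C_1, \dots, C_{n-1}$ which gives a
volume of


\[
\begin{aligned}
  V &= V_1 + V_2 + \cdots + V_{n-1} \\
    &= \frac{\pi r^2 h}{n^3} \left(1^2 + 2^2 + 3^2 + \cdots + (n-1)^2\right) \\
    &= \frac{\pi r^2 h}{n^3} \cdot \frac{(n-1)(n)(2n-1)}{6}
\end{aligned}
\]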


Thus the true volume of the cone is bounded between


\[
  \frac{\pi r^2 h}{n^3} \cdot \frac{(n-1)(n)(2n-1)}{6}
  \le \text{correct volume} \le
  \frac{\pi r^2 h}{n^3} \cdot \frac{n(n+1)(2n+1)}{6}
\]
We can now take the limit as the number of cylinders, n, goes to infinity. The upper bound
becomes


\[
\begin{aligned}
  \lim_{n\to\infty} \frac{\pi r^2 h}{n^3} \cdot \frac{n(n+1)(2n+1)}{6}
    &= \frac{\pi r^2 h}{6} \lim_{n\to\infty} \frac{n(n+1)(2n+1)}{n^3} \\
    &= \frac{\pi r^2 h}{6} \lim_{n\to\infty} \frac{(1+1/n)(2+1/n)}{1} \\
    &= \frac{\pi r^2 h}{6} \times 2 \\
    &= \frac{\pi r^2 h}{3}
\end{aligned}
\]


The other limit is identical, so by the squeeze theorem we have 


\[
  \text{Volume of cone} = \frac{1}{3} \pi r^2 h
\]
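
To see the squeeze happening numerically, here is a minimal sketch in Python. The function name `cone_bounds` and the sample dimensions r = 1, h = 3 are assumptions made for this illustration only. It builds the two stacks of cylinders exactly as described above, with cylinder $C_k$ having radius $kr/n$ and height $h/n$, and prints the lower bound, the value $\frac{1}{3}\pi r^2 h$ and the upper bound for increasing n.

```python
import math


def cone_bounds(r, h, n):
    """Return (lower, upper) volumes of the two stacks of cylinders."""
    # Cylinder C_k has radius k*r/n and height h/n, for k = 1, ..., n.
    volumes = [math.pi * (k * r / n) ** 2 * (h / n) for k in range(1, n + 1)]
    upper = sum(volumes)        # stack C_1, ..., C_n
    lower = sum(volumes[:-1])   # stack C_1, ..., C_{n-1}
    return lower, upper


r, h = 1.0, 3.0                     # arbitrary sample dimensions
exact = math.pi * r ** 2 * h / 3    # the limiting value derived above
for n in (10, 100, 1000):
    lower, upper = cone_bounds(r, h, n)
    print(n, lower, exact, upper)
```

The two bounds differ by exactly the volume of the largest cylinder, $V_n = \pi r^2 h / n$, so the gap shrinks in proportion to $1/n$.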


Now the sphere, though we will do the analysis for a hemisphere of radius R. Again
we bound the volume above and below by stacks of cylinders. The cross-sections of the
cylinders and hemisphere are also shown.


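[Figure: a hemisphere of radius $R$ bounded by stacks of cylinders, with cross-sections labelled $r_k$, $y_k$ and $R$.]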


To obtain the bounds we will construct two stacks of n cylinders, $C_1, C_2, \dots, C_n$. Each
cylinder has height $R/n$ and radius that varies with its position in the stack. To describe
the position, define


\[
  y_k = k \times \frac{R}{n}
\]


That is, $y_k$ is $k$ steps of distance $R/n$ from the top of the hemisphere. Then we set the $k$-th
cylinder, $C_k$, to have height $R/n$ and radius $r_k$ given by


\[
\begin{aligned}
  r_k^2 &= R^2 - (R - y_k)^2 = R^2 - R^2 (1 - k/n)^2 \\
        &= R^2 \left(2k/n - k^2/n^2\right)
\end{aligned}
\]
as shown in the top-right and bottom-left illustrations. The volume of $C_k$ is then


\[
  V_k = \pi \times \text{radius}^2 \times \text{height}
      = \pi \times R^2 \left(2k/n - k^2/n^2\right) \times \frac{R}{n}
\]


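We obtain an upper bound by stacking cylinders $C_1, C_2, \dots, C_n$ as shown. This object
has volume


\[
\begin{aligned}
  V &= V_1 + V_2 + \cdots + V_n \\
    &= \pi R^3 \left( \frac{2}{n^2} (1 + 2 + 3 + \cdots + n)
       - \frac{1}{n^3} \left(1^2 + 2^2 + 3^2 + \cdots + n^2\right) \right)
\end{aligned}
\]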


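Now recall from above that


\[
  1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}
  \qquad\text{and}\qquad
  1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}
\]


so


\[
  V = \pi R^3 \left( \frac{n(n+1)}{n^2} - \frac{n(n+1)(2n+1)}{6n^3} \right)
\]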


Again, a lower bound is obtained by stacking cylinders $C_1, \dots, C_{n-1}$ and a similar
analysis gives


\[
  V = \pi R^3 \left( \frac{n(n-1)}{n^2} - \frac{n(n-1)(2n-1)}{6n^3} \right)
\]


Thus the true volume of the hemisphere is bounded between


\[
  \pi R^3 \left( \frac{n(n-1)}{n^2} - \frac{n(n-1)(2n-1)}{6n^3} \right)
  \le \text{correct volume} \le
  \pi R^3 \left( \frac{n(n+1)}{n^2} - \frac{n(n+1)(2n+1)}{6n^3} \right)
\]


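We can now take the limit as the number of cylinders, n, goes to infinity. The upper bound
becomes


\[
\begin{aligned}
  \lim_{n\to\infty} \pi R^3 \left( \frac{n(n+1)}{n^2} - \frac{n(n+1)(2n+1)}{6n^3} \right)
    &= \pi R^3 \lim_{n\to\infty} \left( \frac{n(n+1)}{n^2} - \frac{n(n+1)(2n+1)}{6n^3} \right) \\
    &= \pi R^3 \left( 1 - \frac{2}{6} \right) = \frac{2}{3} \pi R^3
\end{aligned}
\]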


The other limit is identical, so by the squeeze theorem we have 


\[
  \text{Volume of hemisphere} = \frac{2}{3} \pi R^3
\]

and so

\[
  \text{Volume of sphere} = \frac{4}{3} \pi R^3
\]
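
The same numerical experiment can be run for the hemisphere. This is again only a sketch: the function name `hemisphere_bounds` and the choice R = 1 are assumptions made for illustration. Cylinder $C_k$ is given height $R/n$ and squared radius $R^2 - (R - y_k)^2$ with $y_k = kR/n$, following the construction above.

```python
import math


def hemisphere_bounds(R, n):
    """Return (lower, upper) volumes of the two stacks of cylinders."""
    volumes = []
    for k in range(1, n + 1):
        y_k = k * R / n                         # distance below the top
        r_k_squared = R ** 2 - (R - y_k) ** 2   # squared radius of C_k
        volumes.append(math.pi * r_k_squared * (R / n))
    lower = sum(volumes[:-1])   # stack C_1, ..., C_{n-1}
    upper = sum(volumes)        # stack C_1, ..., C_n
    return lower, upper


R = 1.0                              # arbitrary sample radius
exact = 2 * math.pi * R ** 3 / 3     # the limiting value derived above
for n in (10, 100, 1000):
    lower, upper = hemisphere_bounds(R, n)
    print(n, lower, exact, upper)
```

Here the gap between the bounds is the volume of the bottom cylinder $C_n$, which has radius $R$ and so volume $\pi R^3 / n$.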